POLYKETIDE-ASSOCIATED SUGAR BIOSYNTHESIS GENES

This application claims the benefit of U.S. Serial No. 08/576,626 filed December 21, 
1995, now pending. 


Field of the Invention
The present invention relates to methods for directing the biosynthesis of specific 
polyketide analogs by genetic manipulation. In particular, sugar biosynthesis genes are 
manipulated to produce precise, novel glycosylation-modified macrolides of predicted 
structure.

Background of the Invention 
Polyketides are a large class of natural products that includes many important
antibiotic, antifungal, anticancer, and anti-helminthic compounds such as erythromycins,
amphotericins, daunorubicins, and avermectins. Their synthesis proceeds by an ordered
condensation of acyl esters to generate carbon chains of varying length, side chain, and
reduction pattern that are differentially cyclized and subsequently modified to give the mature
polyketides. For many polyketides, maturation includes the addition of one or more sugar
residues to the cyclized carbon chain. The sugar residues are frequently critical to the
biological activity of the mature polyketide.

Streptomyces and the closely related Saccharopolyspora genera are prodigious
producers of polyketide metabolites. Because of the commercial significance of these
compounds, a great amount of effort has been expended in the study of Streptomyces
genetics. Consequently, much is known about Streptomyces and several cloning vectors exist
for introducing DNA into these organisms.

Although many polyketides have been identified, there remains the need to obtain
novel glycosylation-modified (as defined herein) polyketide structures with enhanced
properties. Current methods of obtaining such molecules include screening of biological
samples and chemical modification of existing polyketides, both of which are costly and
time consuming. Current screening methods are based on gross properties of the molecule,
e.g. antibacterial or antifungal activity, and both a priori knowledge of the structure of the
molecules obtained and predetermination of enhanced properties are virtually impossible.
Standard chemical modification of existing structures has been successfully employed, but is
limited by the number of types of compounds obtainable. Furthermore, the poor yield of
multistep chemical syntheses often limits the practicality of this approach. The following
modifications to sugar residues bound to polyketides are particularly difficult or inefficient at
the present time: changing the stereochemistry of specific hydroxyl or methyl groups, changing
the oxidation state of specific hydroxyl groups, and deoxygenating specific carbons.
Accordingly, there exists a need to obtain molecules wherein such changes are specified and
performed, which would represent an improvement in the technology to produce altered
glycosylation-modified polyketide molecules with predicted structure.

The present invention overcomes these problems by providing the genetic sequence of 
sugar biosynthesis genes involved in the biosynthesis of polyketide-associated sugars. 

Summary of the Invention
In one aspect, the present invention provides an isolated single or double stranded
polynucleotide, typically DNA, having a nucleotide sequence which comprises (a) a
nucleotide sequence selected from the group consisting of (i) the sense sequence of FIG. 4A
(SEQ ID NO:1) from about nucleotide position 54 to about nucleotide position 1136; (ii) the
sense sequence of SEQ ID NO:1 from about nucleotide position 1147 to about nucleotide
position 2412; (iii) the sense sequence of SEQ ID NO:1 from about nucleotide position 2409
to about nucleotide position 3410; (iv) the sense sequence of FIG. 4B (SEQ ID NO:2) from
about nucleotide position 80 to about nucleotide position 1048; (v) the sense sequence of
SEQ ID NO:2 from about nucleotide position 1048 to about nucleotide position 2295; (vi) the
sense sequence of SEQ ID NO:2 from about nucleotide position 2348 to about nucleotide
position 3061; (vii) the sense sequence of SEQ ID NO:2 from about nucleotide position 3214
to about nucleotide position 4677; (viii) the sense sequence of SEQ ID NO:2 from about
nucleotide position 4674 to about nucleotide position 5879; (ix) the sense sequence of SEQ
ID NO:2 from about nucleotide position 5917 to about nucleotide position 7386; and (x) the
sense sequence of SEQ ID NO:2 from about nucleotide position 7415 to about nucleotide
position 7996; (b) sequences complementary to the sequences of (a); (c) sequences that, on
expression, encode a polypeptide encoded by the sequences of (a); and (d) analogous
sequences that hybridize under stringent conditions to the sequences of (a) and (b). A
preferred molecule is a DNA molecule. In another embodiment, the polynucleotide is an
RNA molecule.

In another embodiment, a DNA molecule of the present invention is contained in an 
expression vector. The expression vector preferably further comprises an enhancer-promoter 
operatively linked to the polynucleotide. In a preferred embodiment, the DNA molecule in 
the vector is one of the preferred sequences mentioned above. In an especially preferred 
embodiment, the DNA molecule in the vector is the sequence of SEQ ID NO:2 from about
nucleotide position 80 to about nucleotide position 1048. 

The present invention still further provides for a host cell transformed with a 
polynucleotide or expression vector of this invention. Preferably, the host cell is a bacterial 
cell selected from the group consisting of Saccharopolyspora spp., Streptomyces spp. and E.
coli.

The present invention also provides methods to produce novel glycosylation-modified
polyketide structures by designing and introducing specified changes in the DNA governing
the synthesis and attachment of sugar residues to polyketides. According to one method, the
biosynthesis of specific glycosylation-modified polyketides is accomplished by genetic
manipulation of a polyketide-producing microorganism comprising the steps of isolating a
sugar biosynthesis gene-containing DNA sequence from those described above; identifying
within the gene-containing DNA sequence one or more DNA fragments responsible for the
biosynthesis of a polyketide-associated sugar or its attachment to the polyketide; introducing one
or more specified changes into the DNA fragment or fragments, thereby resulting in an
altered DNA sequence; introducing the altered DNA sequence into a polyketide-producing
microorganism to replace the original sequence whereby the altered DNA sequence, when
translated, results in altered enzymatic activity capable of effecting the production of the
specific glycosylation-modified polyketide; growing a culture of the altered polyketide-
producing microorganism under conditions suitable for the formation of the specific
glycosylation-modified polyketide; and isolating said specific glycosylation-modified
polyketide from the culture.

In a second method the biosynthesis of specific glycosylation-modified polyketides is
accomplished by isolating a sugar biosynthesis gene-containing DNA sequence from
those described above; identifying within the gene-containing DNA sequence one or more
DNA fragments responsible for the biosynthesis of a polyketide-associated sugar or its
attachment to the polyketide; reversing the strand orientation of the DNA fragment or
fragments, thereby resulting in an altered DNA sequence which, when transcribed, results in
production of an antisense mRNA; introducing the altered DNA sequence into a polyketide-
producing microorganism having an mRNA capable of binding to the antisense mRNA, which
results in altered enzymatic activity capable of effecting the production of the specific
glycosylation-modified polyketide; growing a culture of the altered polyketide-producing
microorganism under conditions suitable for the formation of the specific glycosylation-
modified polyketide; and isolating the specific glycosylation-modified polyketide from the
culture.
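
The strand reversal on which this second method relies can be pictured with a minimal Python sketch. The function names and the 12-nucleotide fragment are hypothetical and are not taken from SEQ ID NO:1 or SEQ ID NO:2; the sketch only shows that a fragment placed in reversed orientation yields a transcript complementary to the native mRNA, and it says nothing about promoters, vectors or expression levels.

```python
# Hypothetical illustration of reversing the strand orientation of a DNA fragment.
# The fragment below is made up for the example; it is not from the sequence listing.

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string (A, C, G, T only)."""
    return dna.translate(COMPLEMENT)[::-1]

def transcribe(coding_strand: str) -> str:
    """Write the mRNA that matches a coding strand (U in place of T)."""
    return coding_strand.replace("T", "U")

sense_fragment = "ATGAGCTCGTCC"                       # made-up sense fragment
sense_mrna = transcribe(sense_fragment)               # native transcript
antisense_mrna = transcribe(reverse_complement(sense_fragment))

print(sense_mrna)
print(antisense_mrna)
```

Read 5' to 3', each base of the antisense transcript pairs with the corresponding base of the sense transcript read 3' to 5', which is the basis for the mRNA binding described above.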

In a third method the biosynthesis of specific glycosylation-modified polyketides is
accomplished by isolating a sugar biosynthesis gene-containing DNA sequence from
those described above; identifying within the gene-containing DNA sequence one or more
DNA fragments responsible for the biosynthesis of a polyketide-associated sugar or its
attachment to the polyketide; introducing the DNA fragment or fragments into a polyketide-
producing microorganism whereupon transcription and translation of the DNA fragment or
fragments generate an altered polyketide-producing microorganism that is capable of
producing the specific glycosylation-modified polyketide; growing a culture of the
polyketide-producing microorganism containing the DNA fragment or fragments under
conditions suitable for the formation of the specific glycosylation-modified polyketide; and
isolating the specific glycosylation-modified polyketide from the culture.

Preferably, the sugar biosynthesis gene-containing DNA sequence of the processes
described above comprises genes which encode an enzymatic activity involved in the
biosynthesis of L-mycarose and/or D-desosamine. More preferably, the sugar biosynthesis
gene-containing DNA sequence comprises the sequence of SEQ ID NO:2 from about
nucleotide position 80 to about nucleotide position 1048.

The present invention is especially useful in manipulating sugar biosynthesis genes
from Streptomyces and Saccharopolyspora, organisms that provide over one-half of the
clinically useful antibiotics.

Brief Description of the Drawings
FIG. 1A illustrates the organization of the erythromycin biosynthetic gene cluster and
the genetic designations of the biosynthetic genes; FIG. 1B illustrates an abbreviated
erythromycin biosynthetic scheme that broadly associates the biosynthetic genes with their
role in erythromycin biosynthesis. Seven eryB genes, eryBI - eryBVII, are responsible for the
biosynthesis of L-mycarose or its attachment to the erythronolide B ring, and six eryC genes,
eryCI - eryCVI, are responsible for the biosynthesis of D-desosamine or its attachment to
3-α-mycarosyl erythronolide B. The dashed arrows indicate that the pathway through
erythromycin B is not the principal natural biosynthetic route to erythromycin A.

FIG. 2 illustrates the proposed scheme for the biosynthesis of L-mycarose and the 
eryB genes responsible for the specific steps. 

FIG. 3 illustrates the proposed scheme for the biosynthesis of D-desosamine and the
eryC genes responsible for the specific steps.

FIG. 4A(1-4) illustrates the nucleotide sequence (SEQ ID NO:1) of the sugar
biosynthesis genes eryCII (coordinates 54-1136), eryCIII (coordinates 1147-2412), and
eryBII (coordinates 2409-3410), with corresponding translation of the open reading frames
(SEQ ID NO:3, SEQ ID NO:4 and SEQ ID NO:5 respectively). Standard one letter codes for
the amino acids appear beneath their respective nucleic acid codons as described herein.

FIG. 4B(1-9) illustrates the nucleotide sequence (SEQ ID NO:2) of the sugar
biosynthesis genes eryBIV (coordinates 80-1048), eryBV (coordinates 1048-2295), eryCVI
(coordinates 2348-3061), eryBVI (coordinates 3214-4677), eryCIV (coordinates 4674-5879),
eryCV (coordinates 5917-7386), and eryBVII (coordinates 7415-7996) with corresponding
translation of the putative open reading frames (SEQ ID NO:6, SEQ ID NO:7, SEQ ID
NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11 and SEQ ID NO:12 respectively).
Standard one letter codes for the amino acids appear beneath their respective nucleic acid
codons as described herein.


FIG. 5A illustrates the amino acid sequence identity between the sugar biosynthesis
enzyme encoded by the eryBIV gene of Sac. erythraea (SEQ ID NO:6) and the sugar
biosynthesis enzymes encoded by the ascF gene of Yersinia pseudotuberculosis [Thorson et
al., J. Bacteriol., 176:5483 (1994)] (SEQ ID NO:13), the rfbJ gene of Salmonella enterica
[Jiang et al., Mol. Microbiol., 5:695 (1991)] (SEQ ID NO:14), the strL gene of Streptomyces
griseus [Pissowotzki et al., Mol. Gen. Genet., 241:193 (1993)] (SEQ ID NO:15) and the galE
gene of Escherichia coli [Lemaire and Müller-Hill, Nucl. Acids Res., 14:7705 (1986)] (SEQ ID
NO:16). In this and all other Figures in which amino acid sequence identity is compared,
capitalized letters represent consensus (identical) amino acids between species or amino acids
which are conservative substitutions for the consensus residues. Also in each Figure, the
sequence identified as "consensus" is merely a convenient representation of conserved amino
acids and is not intended as a representation of any existing polypeptide sequence.

FIG. 5B illustrates the amino acid sequence identity between the sugar biosynthesis
enzyme encoded by the eryBVII gene of Sac. erythraea (SEQ ID NO:12) and the sugar
biosynthesis enzymes encoded by the strM gene of Streptomyces griseus [Pissowotzki et al.,
Mol. Gen. Genet., 241:193 (1993)] (SEQ ID NO:17), the rfbC gene of Salmonella enterica
[Jiang et al., Mol. Microbiol., 5:695 (1991)] (SEQ ID NO:18), the rfbF gene of Yersinia
enterocolitica [Zhang et al., Mol. Microbiol., 9:309 (1993)] (SEQ ID NO:19), and the ascE
gene of Yersinia pseudotuberculosis [Thorson et al., J. Bacteriol., 176:5483 (1994)] (SEQ ID
NO:20).

FIG. 5C illustrates the amino acid sequence identity between the sugar biosynthesis
enzyme encoded by the eryCIV gene of Sac. erythraea (SEQ ID NO:10) and the sugar
biosynthesis enzymes encoded by the eryCI gene of Sac. erythraea [Dhillon et al., Mol.
Microbiol., 3:1405 (1989)] (SEQ ID NO:21), the ascC gene of Yersinia pseudotuberculosis
[Weigel et al., Biochemistry, 31:2129 (1992); Thorson et al., J. Am. Chem. Soc., 115:6993
(1993); Thorson et al., J. Bacteriol., 176:5483 (1994)] (SEQ ID NO:22), the dnrJ gene of
Streptomyces peucetius [Stutzman-Engwall et al., J. Bacteriol., 174:144 (1992)] (SEQ ID
NO:23), the prg1 gene of Streptomyces alboniger [Lacalle et al., EMBO J., 11:785 (1992)]
(SEQ ID NO:24), and the strS gene of Streptomyces griseus [Distler et al., Gene, 115:105
(1992)] (SEQ ID NO:25).

FIG. 5D illustrates the amino acid sequence identity between the sugar biosynthesis
enzymes encoded by the eryBV and eryCIII genes of Sac. erythraea (SEQ ID NO:7 and SEQ
ID NO:4 respectively) and the sugar biosynthesis enzyme encoded by the dnrS gene of
Streptomyces peucetius [Otten et al., J. Bacteriol., 177:6688 (1995)] (SEQ ID NO:26).

FIG. 5E illustrates the amino acid sequence identity between the sugar biosynthesis
enzyme encoded by the eryCVI gene of Sac. erythraea (SEQ ID NO:8) and the sugar
biosynthesis enzymes encoded by the srmX gene of Streptomyces ambofaciens [Geistlich et
al., Mol. Microbiol., 6:2019 (1992)] (SEQ ID NO:27), the rdmD gene of Streptomyces
purpurascens [GenBank Accession: U10405] (SEQ ID NO:28) and the glycine
methyltransferase of Rattus norvegicus [Ogawa et al., Eur. J. Biochem., 168:141 (1987)]
(SEQ ID NO:29).

FIG. 6A through 6D illustrate the compounds conceivably formed in Examples 1-4
respectively and are representative of compounds formed from Type I (FIG. 6A), Type II
(FIG. 6B), and Type III (FIGS. 6C and 6D) alterations.

FIG. 7 illustrates the construction of the expression plasmid pASX2 described in
Example 2. For FIGS. 7-13 the following abbreviations have been used: amp, ampicillin
resistance gene; tsr, thiostrepton resistance gene; ROP, repressor of plasmid synthesis gene;
eryBI, eryBII, eryBIII, eryBIV, eryBV, eryBVI, eryBVII, eryCI, eryCII, eryCIII, eryCIV,
eryCV, and eryCVI, the erythromycin biosynthetic genes involved in the synthesis of
mycarose or its attachment to the macrolide ring (eryB) or the synthesis of desosamine or its
attachment to the macrolide ring (eryC) [the thin arrows above a gene indicate its relative size
and the direction of transcription]; ori-E. coli, an origin of DNA replication that functions in
E. coli, in the specific examples the ColE1 origin; ori-Streptomyces, an origin of DNA
replication that functions in Streptomyces, in the specific examples the pJV1 origin
[Servin-Gonzalez et al., Microbiology, 141:2499 (1995)]; p-ermE*, a modified promoter for the
erythromycin resistance gene; t-fd, the gene VIII transcription terminator of bacteriophage fd;
PCR, polymerase chain reaction. Restriction enzyme sites have been indicated by their
standard commercial names (e.g. BamHI, EcoRI, etc.). The abbreviations appended to the
large arrows in the plasmid synthetic schemes summarize each of the steps involved in the
plasmid constructions. These steps are described fully in the relevant Examples.


FIG. 8 illustrates the construction of the eryBVII antisense expression plasmid
pASBVII described in Example 2.

FIG. 9A illustrates the construction of the carrier plasmid pK1.

FIG. 9B-E illustrates the construction of plasmid pKB6 which carries all of the eryB
genes and is described in Example 3.

FIG. 10 illustrates the construction of expression plasmid pX1 described in Example

FIG. 11 illustrates the construction of the eryB expression plasmids pXSB6 and pXB6
described in Example 3.

FIG. 12A-B illustrate the construction of plasmid pKC4 which carries all of the eryC
genes described in Example 4.

FIG. 13 illustrates the construction of the eryC expression plasmids pXSC4 and pXC4
described in Example 4.

Detailed Description of the Invention

I. The Invention

The present invention provides isolated and purified polynucleotides that encode 
enzymes or fragments thereof responsible for the biosynthesis of polyketide-associated sugars 
or their attachment to polyketides, vectors containing those polynucleotides, host cells 
transformed with those vectors, a process of making novel glycosylated polyketides using
those polynucleotides and vectors, and isolated and purified recombinant polypeptides and
polypeptide fragments thereof.

II. Definitions

For the purposes of the present invention as disclosed and claimed herein, the
following terms are defined.

The term "polyketide" as used herein refers to a large and diverse class of natural
products, including but not limited to antibiotic, antifungal, anticancer, and anti-helminthic
compounds. Antibiotics include, but are not limited to, anthracyclines and macrolides of
different types (polyenes and avermectins as well as classical macrolides such as
erythromycins).

The term "glycosylated polyketide" refers to any polyketide that contains one or more
sugar residues.

The term "glycosylation-modified polyketide" refers to a polyketide having a changed
glycosylation pattern or configuration relative to that particular polyketide's unmodified or
native state.

The term "polyketide-producing microorganism" as used herein includes any
microorganism that can produce a polyketide naturally or after being suitably engineered (i.e.
genetically). Examples of actinomycetes and the polyketides they naturally produce include
but are not limited to those listed in Table 1 below (see Hopwood, D.A. and Sherman, D.H.,
Annu. Rev. Genet., 24:37-66 (1990), incorporated herein by reference).

Table 1

Organism                        Polyketide Produced
Saccharopolyspora erythraea     Erythromycin
Streptomyces ambofaciens        Spiramycin
Streptomyces avermitilis        Avermectin
Streptomyces fradiae            Tylosin
Streptomyces griseus            Candicidin, monactin, griseusin
Streptomyces violaceoruber      Granaticin
Streptomyces thermotolerans     Carbomycin
Streptomyces rimosus            Oxytetracycline
Streptomyces peucetius          Daunorubicin
Streptomyces coelicolor         Actinorhodin
Streptomyces glaucescens        Tetracenomycin
Streptomyces roseofulvus        Frenolicin
Streptomyces cinnamonensis      Monensin
Streptomyces curacoi            Curamycin
Amycolatopsis mediterranei      Rifamycin

Other examples of polyketide-producing microorganisms that produce polyketides
naturally include various Actinomadura, Dactylosporangium and Nocardia strains.

The term "sugar biosynthesis genes" as used herein refers to sequences of DNA from
Saccharopolyspora erythraea that encode sugar biosynthesis enzymes and is intended to
include sequences of DNA from other polyketide-producing microorganisms which are
identical or analogous to those obtained from Saccharopolyspora erythraea.

The term "sugar biosynthesis enzymes" as used herein refers to polypeptides which
are involved in the biosynthesis and/or attachment of polyketide-associated sugars and their
derivatives and intermediates.

The term "polyketide-associated sugar" refers to a sugar that is known to attach to
polyketides or that can be attached to polyketides by the processes described herein.

The term "sugar derivative" refers to a sugar which is naturally associated with a
polyketide but which is altered relative to the unmodified or native state; examples
include N-3-α-desdimethyl D-desosamine, D-mycarose, 4-keto-L-mycarose, 4-keto-D-
mycarose, 3-desmethyl L-mycarose and 3-desmethyl D-mycarose.

The term "sugar intermediate" refers to an intermediate compound produced in a 
sugar biosynthesis pathway.

The term "eryB" as used herein refers to sequences of DNA that encode enzymes
involved specifically in the biosynthesis of the deoxysugar L-mycarose.

The term "eryC" as used herein refers to sequences of DNA that encode enzymes
involved specifically in the biosynthesis of the deoxysugar D-desosamine.

III. Polynucleotides

The organization of the segment of the Saccharopolyspora erythraea (Sac. erythraea)
chromosome that determines the biosynthesis of erythromycin and the corresponding genes
that determine the biosynthesis of the sugars L-mycarose and D-desosamine, designated
eryB and eryC, respectively, are shown in FIG. 1A. It is seen that several genes are required
for the biosynthesis of each of the sugars and that these genes are interspersed among one
another. It is predicted that each gene encodes an enzyme that catalyzes one or a few steps in
the biosynthesis of L-mycarose or D-desosamine from thymidine diphospho-4-keto-6-
deoxyglucose (TDP-glucose); these steps are outlined in FIG. 2 and FIG. 3. In the case of
L-mycarose (shown in FIG. 2), these steps include: (1) C-2" deoxygenation, (2) C-2"/C-3"
enoyl reduction, (3) C-5" epimerization, (4) C-3" C-methylation, (5) C-4" keto reduction, and
(6) transfer to erythronolide B. For D-desosamine (shown in FIG. 3), these steps comprise (1)
C-4'/3' isomerization, (2, 3) C-3' deoxygenation and reduction, (4) C-3' amination,
(5, 6) N-3α' N-dimethylation, and transfer to mycarosyl erythronolide B.

This classification of genes (as belonging to either the eryB class or eryC class) was
determined by first altering the wild type genes of interest in an erythromycin producing
strain (i.e. in vivo) to inactivate their expression. The erythromycin products resulting from
such alterations were then analyzed. Genes whose alterations caused an accumulation of
erythronolide B (indicating a lack of L-mycarose, or failure to attach L-mycarose to the
erythronolide ring) were classified as eryB genes; genes whose alterations caused an
accumulation of 3-α-L-mycarosyl erythronolide B (indicating a lack of D-desosamine, or
failure to attach D-desosamine to the 3-α-L-mycarosyl erythronolide B ring) were classified
as eryC genes. Accordingly, it should be noted that all such genes identified herein as eryB
or eryC are involved in the synthesis of L-mycarose or D-desosamine. The predicted
functional activities of the polypeptides encoded by eryB and eryC will be discussed in
further detail below.
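
The classification rule just described amounts to a lookup from the intermediate that accumulates in an inactivation mutant to a gene class. The following Python sketch restates that rule only; the function name, the dictionary and the "unclassified" fallback are hypothetical conveniences and not part of the disclosed method.

```python
# Hypothetical restatement of the eryB/eryC classification rule described in the text.
# Keys are the intermediates named in the text; "alpha" is spelled out for plain ASCII.

ACCUMULATION_RULES = {
    # No L-mycarose made, or L-mycarose not attached to the erythronolide ring.
    "erythronolide B": "eryB",
    # No D-desosamine made, or D-desosamine not attached.
    "3-alpha-L-mycarosyl erythronolide B": "eryC",
}

def classify_gene(accumulated_intermediate: str) -> str:
    """Return the gene class implied by the intermediate a mutant accumulates."""
    return ACCUMULATION_RULES.get(accumulated_intermediate, "unclassified")

print(classify_gene("erythronolide B"))
print(classify_gene("3-alpha-L-mycarosyl erythronolide B"))
```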

In one aspect then, the present invention provides isolated and purified eryB and eryC
polynucleotides from Sac. erythraea that encode enzymes involved in the production of
glycosylated polyketides. A polynucleotide of the present invention that encodes a sugar
biosynthesis enzyme is an isolated single or double stranded polynucleotide having a
nucleotide sequence which comprises (a) a nucleotide sequence selected from the group
consisting of (i) the sense sequence of FIG. 4A (SEQ ID NO:1) from about nucleotide
position 54 to about nucleotide position 1136; (ii) the sense sequence of SEQ ID NO:1 from
about nucleotide position 1147 to about nucleotide position 2412; (iii) the sense sequence of
SEQ ID NO:1 from about nucleotide position 2409 to about nucleotide position 3410; (iv)
the sense sequence of FIG. 4B (SEQ ID NO:2) from about nucleotide position 80 to about
nucleotide position 1048; (v) the sense sequence of SEQ ID NO:2 from about nucleotide
position 1048 to about nucleotide position 2295; (vi) the sense sequence of SEQ ID NO:2
from about nucleotide position 2348 to about nucleotide position 3061; (vii) the sense
sequence of SEQ ID NO:2 from about nucleotide position 3214 to about nucleotide position
4677; (viii) the sense sequence of SEQ ID NO:2 from about nucleotide position 4674 to
about nucleotide position 5879; (ix) the sense sequence of SEQ ID NO:2 from about
nucleotide position 5917 to about nucleotide position 7386; and (x) the sense sequence of
SEQ ID NO:2 from about nucleotide position 7415 to about nucleotide position 7996;

(b) sequences complementary to the sequences of (a),

(c) sequences that, when expressed, encode polypeptides encoded by the sequences of
(a), and

(d) analogous sequences that hybridize under stringent conditions to the sequences of
(a).

A preferred polynucleotide is a DNA molecule. In another embodiment, the polynucleotide
is an RNA molecule.

The nucleotide sequence and deduced amino acid residue sequences of the sugar
biosynthesis genes are set forth in FIG. 4A(1-4) and FIG. 4B(1-9). The nucleotide sequences
of FIG. 4A(1-4) (SEQ ID NO:1) and FIG. 4B(1-9) (SEQ ID NO:2) represent full length DNA
clones of the sense strand of two distinct clusters of sugar biosynthesis genes and are
intended to represent both the sense strand (shown on top) and its complement. The amino
acid sequences depicted below the sense strand correspond to polypeptides encoded by a
nucleotide sequence selected from the group consisting of (i) the sense strand of SEQ ID
NO:1 from about nucleotide position 54 to about nucleotide position 1136, (ii) the sense
sequence of SEQ ID NO:1 from about nucleotide position 1147 to about nucleotide position
2412, (iii) the sense sequence of SEQ ID NO:1 from about nucleotide position 2409 to about
nucleotide position 3410, (iv) the sense sequence of SEQ ID NO:2 from about nucleotide
position 80 to about nucleotide position 1048, (v) the sense sequence of SEQ ID NO:2 from
about nucleotide position 1048 to about nucleotide position 2295, (vi) the sense sequence of
SEQ ID NO:2 from about nucleotide position 2348 to about nucleotide position 3061, (vii)
the sense sequence of SEQ ID NO:2 from about nucleotide position 3214 to about nucleotide
position 4677, (viii) the sense sequence of SEQ ID NO:2 from about nucleotide position 4674
to about nucleotide position 5879, (ix) the sense sequence of SEQ ID NO:2 from about
nucleotide position 5917 to about nucleotide position 7386 and (x) the sense sequence of
SEQ ID NO:2 from about nucleotide position 7415 to about nucleotide position 7996. The
polypeptides encoded by the nucleotide sequences of (i)-(x) above are set forth as SEQ ID
NO:3-SEQ ID NO:12 respectively.
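
The coordinate ranges recited above can be collected into a small table and checked mechanically. The Python sketch below is an illustration only: the gene names and coordinates are those given for FIG. 4A and FIG. 4B, while the dictionary layout and the helper names `orf_length` and `extract_orf` are assumptions made for the example. Coordinates are treated as 1-based and inclusive. Whether a recited end position includes a stop codon is not stated here, so no polypeptide length is inferred.

```python
# ORF coordinates as recited in the text (1-based, inclusive).
# The data layout and helper names are hypothetical.

ORFS = {
    "eryCII":  ("SEQ ID NO:1", 54, 1136),
    "eryCIII": ("SEQ ID NO:1", 1147, 2412),
    "eryBII":  ("SEQ ID NO:1", 2409, 3410),
    "eryBIV":  ("SEQ ID NO:2", 80, 1048),
    "eryBV":   ("SEQ ID NO:2", 1048, 2295),
    "eryCVI":  ("SEQ ID NO:2", 2348, 3061),
    "eryBVI":  ("SEQ ID NO:2", 3214, 4677),
    "eryCIV":  ("SEQ ID NO:2", 4674, 5879),
    "eryCV":   ("SEQ ID NO:2", 5917, 7386),
    "eryBVII": ("SEQ ID NO:2", 7415, 7996),
}

def orf_length(start: int, end: int) -> int:
    """Number of nucleotides in a 1-based inclusive range."""
    return end - start + 1

def extract_orf(sequence: str, start: int, end: int) -> str:
    """Slice a 1-based inclusive range out of a 0-indexed Python string."""
    return sequence[start - 1:end]

for gene, (seq_id, start, end) in ORFS.items():
    n = orf_length(start, end)
    print(f"{gene:8s} {seq_id}  {start}-{end}  {n} nt  whole codons: {n % 3 == 0}")
```

Every recited range spans a whole number of codons. The ranges also show that some adjacent reading frames overlap slightly, for example eryCIII (ending at 2412) and eryBII (starting at 2409) in SEQ ID NO:1.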

The present invention also contemplates analogous DNA sequences which hybridize
under stringent hybridization conditions to the DNA sequences set forth above. Stringent
hybridization conditions are well known in the art and define a degree of sequence identity
greater than about 80%-90%. The modifier "analogous" refers to those nucleotide sequences
that encode analogous polypeptides (i.e. in relation to a sugar biosynthesis enzyme),
analogous polypeptides being those which have only conservative differences and which
retain the conventional characteristics and activities of sugar biosynthesis enzymes. (A more
detailed description of analogous polypeptides is provided below.) The present invention
also contemplates naturally occurring allelic variations and mutations of the DNA sequences
set forth above so long as those variations and mutations code, on expression, for a sugar
biosynthesis enzyme of this invention as set forth hereinafter.
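
As a rough numerical companion to the 80%-90% figure, percent identity between two sequences can be computed position by position. The Python sketch below is a simplification under stated assumptions: the two sequences are already aligned and of equal length, gaps are not handled, and the example sequences are made up. Hybridization stringency itself is an experimental property (temperature, salt, probe length), so the number computed here is only a proxy and not a test of hybridization.

```python
# Hypothetical helper: percent identity of two pre-aligned, equal-length sequences.

def percent_identity(seq_a: str, seq_b: str) -> float:
    """Return the percentage of positions at which the two sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)

# Made-up 10-nucleotide example with a single mismatch at position 9.
reference = "ATGAGCTCGT"
candidate = "ATGAGCTCCT"
print(percent_identity(reference, candidate))
```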

As is well known in the art, because of the degeneracy of the genetic code, there are
numerous other DNA and RNA molecules that can code for the same polypeptides as those
encoded by the aforementioned sugar biosynthesis genes and fragments thereof. The present
invention, therefore, contemplates those other DNA and RNA molecules which, on
expression, encode the polypeptides of SEQ ID NO:3-SEQ ID NO:11 or fragments thereof.
Having identified the amino acid residue sequence encoded by a sugar biosynthesis gene, and
with knowledge of all triplet codons for each particular amino acid residue, it is possible to
describe all such encoding RNA and DNA sequences. DNA and RNA molecules other than
those specifically disclosed herein, which molecules are characterized simply by a
change in a codon for a particular amino acid, are within the scope of this invention.

The 20 common amino acids and their representative abbreviations, symbols and
codons are well known in the art (see, for example, Molecular Biology of the Cell, Second
Edition, B. Alberts et al., Garland Publishing Inc., New York and London, 1989). As is also
well known in the art, codons constitute triplet sequences of nucleotides in mRNA molecules
and as such, are characterized by the base uracil (U) in place of the base thymine (T) which is
present in DNA molecules. A simple change in a codon for the same amino acid residue
within a polynucleotide will not change the structure of the encoded polypeptide. By way of
example, it can be seen from SEQ ID NO:1 that an AGC codon for serine exists at nucleotide
positions 126-128 and again at positions 420-422 and 561-563. However, it can also be seen
from that same sequence that serine can be encoded by a TCG codon (see e.g. nucleotide
positions 192-194) and a TCC codon (see e.g. nucleotide positions 204-206). Substitution of
the latter codons for serine with the AGC codon for serine, or vice versa, does not
substantially alter the DNA sequence of SEQ ID NO:1 and results in production of the same
polypeptide. In a similar manner, substitutions of the recited codons with other equivalent
codons can be made without departing from the scope of the present
invention.
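
The serine example can be checked with a few lines of Python. This is a minimal sketch: the codon table is deliberately partial (only the standard-code codons used in the example), and the two 12-nucleotide fragments are made up rather than copied from SEQ ID NO:1. The function names are likewise hypothetical.

```python
# Partial standard genetic code: only the codons needed for this example.
CODON_TABLE = {
    "ATG": "M",
    "AGC": "S", "TCG": "S", "TCC": "S",   # three of the six serine codons
    "GGC": "G",
}

def translate(dna: str) -> str:
    """Translate an in-frame DNA string codon by codon using CODON_TABLE."""
    if len(dna) % 3 != 0:
        raise ValueError("length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))

def to_mrna(dna: str) -> str:
    """Write the mRNA form of a coding strand (U in place of T)."""
    return dna.replace("T", "U")

original = "ATGAGCGGCTCG"   # Met-Ser(AGC)-Gly-Ser(TCG), made-up fragment
variant = "ATGTCCGGCAGC"    # Met-Ser(TCC)-Gly-Ser(AGC), synonymous swaps

print(translate(original), translate(variant))
print(to_mrna(variant))
```

Both fragments translate to the same four residues, which is the sense in which a synonymous codon change leaves the encoded polypeptide unchanged.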

A polynucleotide of the present invention can also be an RNA molecule. An RNA 
molecule contemplated by the present invention is complementary to or hybridizes under 
stringent conditions to any of the DNA sequences set forth above. Exemplary and preferred 
RNA molecules are mRNA molecules that encode sugar biosynthesis enzymes of this
invention. 

IV. Polypeptides

In another aspect, the present invention provides polypeptides which arc reasonably 
believed to be sugar biosynthesis enzymes. A sugar biosynthesis enzyme of the present 
invention is a polypeptide of about 21 kdal to about 47 kdal. As set forth in FIG. 5A-5E, 
analogs of the predicted polypeptides encoded by certain eryB and eryC genes have been 
identified in various species and their sequences compared using the PRETTY routine 
(Genetics Computer Group (GCG) Sequence Analysis Software Package, Madison, WI). 
Due to the degree of amino acid sequence identity existing between the polypeptides of these 
other sugar biosynthesis genes and the polypeptides encoded by the eryB and eryC genes, 
certain enzymatic activities can reasonably be attributed to the eryB and eryC polypeptides.

By way of example, analogs of the polypeptide encoded by the eryBIV gene have
been identified in Yersinia pseudotuberculosis, Salmonella enterica, Streptomyces griseus and
Escherichia coli (see FIG. 5A). The various analogs have been identified with from 290-328
amino acid residues and are characterized by a low degree of amino acid sequence identity.
(For example, the identity between the sugar biosynthesis enzyme encoded by the eryBIV
gene of Sac. erythraea and the sugar biosynthesis enzyme encoded by the galE gene of E.
coli is 20% at the amino acid level.) However, a conserved amino acid sequence motif, G x x
G x x G (where G represents the amino acid glycine and x represents any other amino acid
residue), is found within the first 30 amino acid residues of all analogs shown. Since the
polypeptide encoded by the galE gene has been shown to be an epimerase whose mechanism
includes a ketoreduction (Bauer et al., Proteins 12:372 (1992)), the eryBIV gene product is
reasonably predicted to be a ketoreductase.
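
A motif search of the kind described above can be sketched in Python as follows. The
regular expression reads "x" as any residue other than glycine, following the definition
in the text. The function name, the 30-residue window parameter, and both example
sequences are hypothetical and are not taken from FIG. 5A.

```python
import re

# G-x-x-G-x-x-G in one-letter code, where x is any residue other than glycine.
MOTIF = re.compile(r"G[^G]{2}G[^G]{2}G")


def has_n_terminal_motif(protein, window=30):
    """Return True if the motif lies entirely within the first `window` residues."""
    return MOTIF.search(protein[:window]) is not None


# Hypothetical sequences, for illustration only.
with_motif = "MRVLVTGATGFLGSHLVRELLARGHEVVAL"
motif_too_late = "A" * 35 + "GATGFLG"

print(has_n_terminal_motif(with_motif))      # motif starts at residue 7
print(has_n_terminal_motif(motif_too_late))  # motif lies beyond the window
```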

As set forth in FIG. 5B, analogs of the sugar biosynthesis enzyme encoded by the
eryBVII gene have been identified in Streptomyces griseus, Salmonella enterica, Yersinia
enterocolitica and Yersinia pseudotuberculosis. The various analogs have been identified with
from 183-200 amino acid residues and are characterized by a moderate degree of amino acid
identity. By way of example, the identity at the amino acid level between the sugar
biosynthesis enzyme encoded by the eryBVII gene of Sac. erythraea and the sugar
biosynthesis enzyme encoded by the rfbC gene of Salmonella enterica or the strM gene of
Streptomyces griseus is 37% and 61%, respectively. Furthermore, a common characteristic
of these particular polypeptides (including that of eryBVII) is that they are only associated
with L-sugar biosynthesis and not with D-sugar biosynthesis. Thus the gene product of
eryBVII is reasonably predicted to function as a C-5 epimerase which converts the
stereochemistry of the sugar from the "D" configuration to the "L" configuration.

As set forth in FIG. 5C, analogs of the sugar biosynthesis enzyme encoded by the
eryCIV gene have been identified in Sac. erythraea and Yersinia pseudotuberculosis. As set
forth in FIG. 5C, the predicted amino acid sequences of the protein products of eryCI and
eryCIV share 34% sequence identity with each other, and 27% and 25% identity, respectively,
with the predicted amino acid sequence encoded by ascC from Yersinia pseudotuberculosis. The
enzyme encoded by ascC has been shown to remove a hydroxyl group located at the C-3
position of L-ascarylose (Liu and Thorson, Annu. Rev. Microbiol. 48:223 (1994)). Thus, at
least one of the polypeptides encoded by eryCI or eryCIV is predicted to be an enzyme which
functions in deoxygenation reactions.

Furthermore, the enzyme encoded by the ascC gene requires the biochemical cofactor
pyridoxamine, which is the same cofactor used in biochemical transamination reactions.
Consequently, it has been proposed that some protein analogs (such as dnrJ from
Streptomyces peucetius, prgl from Streptomyces alboniger and strS from Streptomyces
griseus) having a moderate degree of sequence similarity to the polypeptide encoded by ascC
function as transaminases in amino sugar biosynthesis (Thorson et al., J. Am. Chem. Soc.
115:6993 (1993)). Since the biosynthesis of D-desosamine requires both deoxygenation and
transamination, it is reasonable to predict that at least one of the polypeptides encoded by the
eryCI or eryCIV genes functions in transamination reactions.

As set forth in FIG. 5D, the predicted polypeptides encoded by eryBV and eryCIII
share 43% identity at the amino acid level and, as such, may be assumed to have similar
activities with respect to their particular sugars. However, as shown in FIGS. 2 and 3, there
are no common steps in the proposed pathways of L-mycarose and D-desosamine
biosynthesis. Rather than having similar sugar biosynthesis functions, these polypeptides are
predicted to be nucleotidyl-sugar transferases which (in Sac. erythraea at least) function to
attach L-mycarose and D-desosamine to erythronolide B and 3-α-mycarosylerythronolide B,
respectively.

As set forth in FIG. 5E analogs of the polypeptide encoded by the eryCVI gene have 
been identified in Streptomyces ambofaciens, Streptomyces purpurascens, and Rattus 
norvegicus. The various analogs have been identified with from 237-293 amino acid residues 
and are characterized by a low to moderate degree of amino acid identity. By way of 
example, the identity between the polypeptide encoded by the eryCVI gene of Sac. erythraea 
and the glycine methyltransferase of Rattus norvegicus is 26% at the amino acid level. 
Furthermore, these sugar biosynthesis enzymes share a common sequence motif,
LDVACGTG (SEQ ID NO:30 = amino acid positions 64-71 in the consensus sequence in
FIG. 5E), with rat glycine methyltransferase, whose biochemical function is known (Ogawa et
al., Eur. J. Biochem. 168:141 (1987)). Thus these polypeptides are predicted to be
methyltransferases. 

In another aspect, the present invention provides a recombinant C-4" ketoreductase
from Sac. erythraea. A recombinant Sac. erythraea ketoreductase of the present
invention is a polypeptide of about 322 or fewer amino acid residues. A preferred recombinant
Sac. erythraea C-4" ketoreductase is that encoded by the nucleotide sequence of SEQ ID
NO:2 from about nucleotide position 80 to about nucleotide position 1048.

The present invention also contemplates amino acid residue sequences that are
substantially duplicative of the sequences set forth herein such that those sequences
demonstrate biological activity like that of the disclosed sequences. Such contemplated
sequences include those analogous sequences characterized by a minimal change in amino acid
residue sequence or type (e.g., conservatively substituted sequences), which insubstantial
change does not alter the fundamental nature and biological activity of the aforementioned
sugar biosynthesis enzymes.

It is well known in the art that modifications and changes can be made in the structure 
of a polypeptide without substantially altering the biological function of that peptide. For 
example, certain amino acids can be substituted for other amino acids in a given polypeptide 
without any appreciable loss of function. In making such changes, substitutions of like amino 
acid residues can be made on the basis of relative similarity of side-chain substituents, for 
example, their size, charge, hydrophobicity, hydrophilicity, and the like. 

As detailed in United States Patent No. 4,554,101, incorporated herein by reference,
the following hydrophilicity values have been assigned to amino acid residues: Arg (+3.0);
Lys (+3.0); Asp (+3.0); Glu (+3.0); Ser (+0.3); Asn (+0.2); Gln (+0.2); Gly (0); Pro (-0.5);
Thr (-0.4); Ala (-0.5); His (-0.5); Cys (-1.0); Met (-1.3); Val (-1.5); Leu (-1.8); Ile (-1.8); Tyr
(-2.3); Phe (-2.5); and Trp (-3.4). It is understood that an amino acid residue can be
substituted for another having a similar hydrophilicity value (e.g., within a value of plus or
minus 2.0) and still obtain a biologically equivalent polypeptide.

In a similar manner, substitutions can be made on the basis of similarity in 
hydropathic index. Each amino acid residue has been assigned a hydropathic index on the 
basis of its hydrophobicity and charge characteristics. Those hydropathic index values are: 
Ile (+4.5); Val (+4.2); Leu (+3.8); Phe (+2.8); Cys (+2.5); Met (+1.9); Ala (+1.8); Gly (-0.4);
Thr (-0.7); Ser (-0.8); Trp (-0.9); Tyr (-1.3); Pro (-1.6); His (-3.2); Glu (-3.5); Gln (-3.5); Asp
(-3.5); Asn (-3.5); Lys (-3.9); and Arg (-4.5). In making a substitution based on the 
hydropathic index, a value of within plus or minus 2.0 is preferred. 
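
The two substitution criteria above (hydrophilicity value and hydropathic index, each
within plus or minus 2.0) can be expressed as a simple lookup. In the Python sketch below
the numeric values are those recited in the two preceding paragraphs; the dictionary and
function names are assumptions introduced for illustration.

```python
# Hydrophilicity values as recited in the text (per U.S. Patent No. 4,554,101).
HYDROPHILICITY = {
    "Arg": 3.0, "Lys": 3.0, "Asp": 3.0, "Glu": 3.0, "Ser": 0.3,
    "Asn": 0.2, "Gln": 0.2, "Gly": 0.0, "Pro": -0.5, "Thr": -0.4,
    "Ala": -0.5, "His": -0.5, "Cys": -1.0, "Met": -1.3, "Val": -1.5,
    "Leu": -1.8, "Ile": -1.8, "Tyr": -2.3, "Phe": -2.5, "Trp": -3.4,
}

# Hydropathic index values as recited in the text.
HYDROPATHY = {
    "Ile": 4.5, "Val": 4.2, "Leu": 3.8, "Phe": 2.8, "Cys": 2.5,
    "Met": 1.9, "Ala": 1.8, "Gly": -0.4, "Thr": -0.7, "Ser": -0.8,
    "Trp": -0.9, "Tyr": -1.3, "Pro": -1.6, "His": -3.2, "Glu": -3.5,
    "Gln": -3.5, "Asp": -3.5, "Asn": -3.5, "Lys": -3.9, "Arg": -4.5,
}


def within_tolerance(scale, residue_a, residue_b, tolerance=2.0):
    """True if the two residues differ by no more than `tolerance` on `scale`."""
    return abs(scale[residue_a] - scale[residue_b]) <= tolerance


print(within_tolerance(HYDROPATHY, "Ile", "Leu"))      # difference of 0.7
print(within_tolerance(HYDROPATHY, "Ile", "Asp"))      # difference of 8.0
print(within_tolerance(HYDROPHILICITY, "Arg", "Lys"))  # identical values
print(within_tolerance(HYDROPHILICITY, "Arg", "Trp"))  # difference of 6.4
```

A check of this kind only screens candidate substitutions by the stated numeric window; it
does not by itself establish that a given substituted polypeptide retains activity.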

V. Production of novel glycosylated polyketides

In another aspect, the present invention comprises a general procedure for producing
novel polyketide structures in vivo by selectively altering, inactivating, or augmenting the
genetic information of the organism that naturally produces a related polyketide. That is, in
the present invention, novel polyketides of desired structure are produced by manipulation of
the eryB and/or eryC genes followed by their introduction into various polyketide-producing
microorganisms. These manipulations result in the formation of "glycosylation-modified"
polyketides (i.e., polyketides having an altered glycosylation pattern or configuration relative
to their native state). For example, "glycosylation-modified" polyketides are those which
have additional sugar groups attached (where none previously existed), different sugars (such
as sugar intermediates) attached in place of the natural sugars, or lack sugar groups (at
positions where sugar groups previously existed).

In the case of Type I and Type II alterations (further described below), glycosylation-modified
polyketides may arise through mechanisms which cause either (1) the non-production
of the sugar attachment enzyme (i.e., the enzyme involved in attachment of a sugar
to the polyketide structure) or (2) the non-production of a sugar biosynthesis enzyme. In
the first instance, the sugar will not be attached to the polyketide since the enzyme which
functions to attach the sugar will be lacking. In the second situation, a sugar intermediate
from the biosynthesis pathway will be produced (depending on which enzyme is lacking) and
attached to the polyketide provided it is recognized as a suitable substrate by the sugar
attachment enzyme; alternatively, it will not be recognized and therefore not attached. In the
case of Type III alterations (also described in detail below), glycosylation-modified
polyketides arise via attachment of additional or different sugars (i.e., not normally found in a
particular polyketide-producing strain) to the polyketide. It should be noted that these
postulated mechanisms are simply provided to enhance understanding of the novel processes
described herein; the actual mechanisms by which the Type I, II and III alterations produce
glycosylation-modified polyketides are not presently known.

In the first type of alteration (referred to herein as Type I alterations), genetically
altered eryB and/or eryC genes are introduced into the chromosome of Sac. erythraea or
another glycosylated polyketide-producing organism that also produces L-mycarose,
D-desosamine, or their closely related derivatives such as mycaminose (4-hydroxy
D-desosamine). The genetic alteration of an eryB and/or eryC gene is such that it causes a
non-functional enzyme to be synthesized. Once introduced into an appropriate strain, the altered
gene replaces its corresponding wild type gene, causing the strain to lose the ability to
produce a particular enzymatic activity involved in sugar biosynthesis. As a result, a
glycosylation-modified polyketide is produced via either of the mechanisms previously
described for a Type I alteration.

In a Type I change described herein, a specific mutation in an eryB and/or eryC gene 
of the Sac. erythraea chromosome is accomplished by a three step process which involves: 
1) specifically altering the DNA sequence of a desired sugar biosynthesis gene, 2) subcloning 
the altered sequence into a suitable vector capable of recombining in the chromosome of an 
appropriate host and 3) introducing the vector containing the subcloned sequence into the 
appropriate host so that exchange of the wild type allele with the mutated one will occur. The 
first step is accomplished using standard recombinant DNA techniques to effect a deletion, 
base pair conversion or frame-shift in the DNA sequence. The second step, which also 
employs standard recombinant techniques, involves subcloning the altered sequence into a 
vector which does not replicate in Sac. erythraea or the desired host. In the final step, the 
vector is introduced into a suitable host, where by the process of gene replacement, the 
altered allele replaces the wild-type one. All techniques employed in a Type I change are 
well known to those of ordinary skill in the art. 

Example 1 illustrates the process of gene replacement of an eryB gene. As Example 1 
shows, the eryB gene of interest is mutated and, along with adjacent upstream and
downstream DNA sequences, cloned into a non-replicating Sac. erythraea plasmid vector.
The vector carrying the mutated allele and adjoining DNA is then introduced into the host
strain by the process of protoplast transformation. Transformants are regenerated under
selective conditions (i.e. conditions that require expression of a particular plasmid marker) in 
order to induce recombination of the plasmid into the host cell chromosome. In other words, 
since the plasmid does not replicate autonomously, it must reside in the chromosome to be 
maintained in the cell and to express a particular marker under selective conditions. Insertion 
is achieved when the regenerated cells undergo a single homologous recombination between 
one of the two DNA segments that flank the mutation on the plasmid and its homologous 
counterpart in the chromosome. The cells are then grown without selection for the marker 
which induces plasmid loss from the chromosome. This loss arises after the cells have 
undergone a second recombination between the second DNA segment that flanks the 
mutation and its homologous chromosomal counterpart. This second recombinational event
results in the loss of the plasmid sequences and the wild type allele from the chromosome; the
mutant allele, however, is retained.
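
The two-step gene replacement described above can be pictured with a toy model in which
the chromosome and the plasmid are lists of named segments. The Python sketch below is a
bookkeeping illustration only: the segment names ("left", "right", "wt", "mut", "vector")
and the two function names are assumptions, and real recombination is of course not
limited to whole-segment boundaries.

```python
def integrate(chromosome, plasmid, flank):
    """Single crossover at `flank`: insert the whole circular plasmid."""
    i = plasmid.index(flank)
    linear = plasmid[i:] + plasmid[:i]  # open the circle at the crossover flank
    j = chromosome.index(flank)
    # The integrant now carries two copies of `flank` bracketing the plasmid body.
    return chromosome[:j] + linear + chromosome[j:]


def resolve(chromosome, flank):
    """Second crossover between the two copies of `flank`: excise what lies between."""
    first = chromosome.index(flank)
    second = chromosome.index(flank, first + 1)
    return chromosome[:first] + chromosome[second:]


chromosome = ["left", "wt", "right"]
plasmid = ["left", "mut", "right", "vector"]  # "vector" carries the marker

integrant = integrate(chromosome, plasmid, "left")
replaced = resolve(integrant, "right")  # second event in the other flank
reverted = resolve(integrant, "left")   # second event in the same flank

print(integrant)
print(replaced)
print(reverted)
```

In this model, resolution through the flank not used for integration leaves the mutant
allele in the chromosome, whereas resolution through the same flank restores the wild type
arrangement; in both cases the vector sequences are lost.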

In a variation of a Type I change, the non-production of the sugar biosynthesis
enzyme (or attachment enzyme) may be achieved by the alternative mechanisms of promoter
inactivation and/or transcriptional terminator insertion. These variations do not affect the
gene sequence itself but rather regulatory mechanisms involved in gene transcription.
"Promoter" as used herein refers to that region of a DNA molecule which controls the
initiation of RNA transcription. Such regions are known to bind RNA polymerases (i.e., the
enzymes involved in synthesizing RNA molecules). This form of Type I change (i.e.,
promoter inactivation) involves two steps: 1) identifying the promoter region of the desired
gene and 2) rendering the promoter region inoperable by mutation. As in the replacement
mechanism described above, such mutations may be effected by creating deletions in the
promoter sequence or by base pair conversion. In the case where the promoter controls
transcription of a single gene, inactivation of the promoter will eliminate expression of that
particular gene; of course, where the promoter controls expression of an entire operon (i.e., a
series of genes whose expression is controlled by a single promoter), promoter inactivation
will effectively eliminate expression of all genes in that operon.

In a similar manner, the non-production of a sugar biosynthesis enzyme (or
attachment enzyme) may arise from inserting a transcriptional terminator upstream from the
gene to be inactivated. A "transcriptional terminator" as used herein is a nucleotide sequence
which signals RNA polymerase to cease transcription. An example of a transcriptional
terminator is a palindromic sequence capable of forming a stem-loop structure that is
followed by a stretch of U residues (for example, the transcriptional terminator that follows
gene VIII of bacteriophage fd (Beck and Zink, Gene, 16:35 (1981))). Effecting a change in
production of a sugar biosynthesis enzyme by this process involves 1) identifying the gene or
genes of interest (in the case of an operon arrangement) to be inactivated and 2) cloning a
transcriptional terminator sequence in a region of the DNA upstream from such gene(s). A
transcriptional terminator will cause the polymerase involved in RNA transcription to stop (at
or near the signaling region), thereby preventing transcription of any downstream sequences.
Thus, changes such as promoter inactivation and transcriptional terminator insertion, which
directly affect expression of sugar biosynthesis genes, are also intended to be within the scope
of the invention.

In the second case (referred to herein as Type II alterations), eryB and/or eryC genes
are arranged on a vector in an antisense orientation relative to a promoter capable of allowing
expression of the gene in Sac. erythraea or Streptomyces. The vector is then introduced into
a polyketide-producing microorganism. As a result of this vector construction, antisense
messenger RNA (mRNA) is produced which interferes with the translation of the wild-type
mRNA. Similarly to the Type I manipulation, novel glycosylation-modified polyketides will
be produced in which the normal mycarose, desosamine, and/or closely related sugar residue
is lacking or is substituted by a sugar intermediate.

In a Type II change, inactivation of the eryB and/or eryC genes by antisense
expression is accomplished by a two step procedure in which (1) a specific sugar biosynthesis
gene is subcloned into an expression vector in an antisense (i.e., reverse) orientation; and (2)
the antisense expression vector is introduced into the desired strain. The first step is
accomplished using standard recombinant DNA techniques employing either E. coli or
Streptomyces as the host, and an expression vector (capable of replicating in either host) that
can be assembled to contain a Streptomyces promoter. Streptomyces promoters may be
obtained from any commercially available Streptomyces plasmids or Streptomyces-E. coli
shuttle plasmids. In step 2, the antisense expression vector is introduced into a suitable
Streptomyces strain and the transformed cells are grown under selective conditions in order to
maintain the expression plasmid in the cell.

As described in Example 2, the gene to be inactivated is subcloned in its reverse
orientation downstream of a Streptomyces promoter (which is contained within a replicating
Sac. erythraea plasmid). The plasmid carrying the antisense gene is then introduced into the
host strain by protoplast transformation. Transformants are regenerated under selective
conditions in order to maintain the autonomously replicating plasmid in the cells. Subsequent
expression of the antisense gene causes the production of an antisense messenger RNA
(mRNA) that is complementary to the mRNA of the native allele of the selected gene.
Through standard nucleotide base pair interactions, the antisense mRNA and the native
mRNA form an RNA duplex that occludes the ribosome binding site of the native mRNA.
This interaction prevents ribosomal translation of the native mRNA and the corresponding
synthesis of the enzyme encoded by that mRNA. In this way, specific enzymatic steps in
sugar biosynthesis corresponding to the identity of the gene expressed in the antisense
orientation are blocked, leading to the production of novel sugar intermediates which, when
attached to the polyketide ring of the host microorganism, give rise to novel
glycosylation-modified polyketides. Alternatively, the antisense expression vector can be
constructed using a non-replicating Sac. erythraea vector that includes flanking DNA from a
nonessential region of the Sac. erythraea chromosome, such as the region immediately upstream
from the eryK gene (FIG. 1). This vector can then be used to stably insert the antisense
construction into the chromosome by homologous recombination in a fashion similar to that
described for the construction of a Type I alteration.
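
The base pairing that underlies antisense inhibition can be sketched in Python. The
example computes the antisense strand of a short RNA segment as its reverse complement and
confirms that the two strands pair base-for-base in antiparallel orientation. The 16-base
segment (with a ribosome-binding-site-like AGGAGG and an AUG start codon) is hypothetical,
and the function names are assumptions made for this illustration.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(mrna):
    """Return the antisense strand (reverse complement), written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(mrna))


def forms_duplex(sense, antisense):
    """True if the strands pair base-for-base in antiparallel orientation."""
    if len(sense) != len(antisense):
        return False
    return all(COMPLEMENT[s] == a for s, a in zip(sense, reversed(antisense)))


# Hypothetical segment of a native mRNA around its ribosome binding site.
native = "AGGAGGUCACAUGAGC"
anti = antisense_rna(native)

print(anti)
print(forms_duplex(native, anti))
```

Transcribing a gene cloned in reverse orientation yields exactly this kind of strand, which
is why the reverse-oriented construct described above produces RNA able to pair with the
native message.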

In the third case (referred to herein as Type III alterations), novel
glycosylation-modified polyketides of desired structure are produced by arranging all or a
subset of the eryB and/or eryC genes on a replicating vector and introducing these genes en
bloc into a "distinct" polyketide-producing organism, i.e., one other than the microorganism
from which the eryB and/or eryC genes were taken. As an example, eryB and/or eryC genes may be
taken from Sac. erythraea and introduced into Streptomyces violaceoniger or Streptomyces
venezuelae. In this case, mycarose, desosamine, their biochemical intermediates and/or their
closely related derivatives will be synthesized and attached at specific positions to polyketide
compounds that do not necessarily carry these, or any, sugar residues. Some examples of
novel glycosylated polyketides that may be produced in hosts that carry such manipulations
are shown in FIG. 6.

In Type III changes, the genes for the biosynthesis of mycarose and/or desosamine are
introduced into a polyketide-producing organism other than Sac. erythraea by another simple
two step procedure: 1) all or a subset of the eryB and/or eryC genes are assembled together on
a replicating plasmid downstream of a Streptomyces promoter; and 2) the plasmid is
introduced into the polyketide-producing organism. Step 1 requires standard recombinant
DNA manipulations employing E. coli and/or Streptomyces as the host. Step 2 requires one
or more plasmids out of the several Streptomyces vectors or E. coli-Streptomyces shuttle
vectors available, one or more promoters that function in Streptomyces, and a selection for
the presence of the strain carrying the plasmid. As described in Examples 3 and 4, sets of the
eryB and/or eryC genes are sequentially subcloned together on a replicating vector
downstream of a suitable promoter that functions in the desired host. The plasmid carrying
the grouped genes is then introduced into the host strain by electroporation or by
transformation of protoplasts employing selection for a plasmid marker.

GENERAL METHODS 

Materials, Plasmids, and Bacterial Strains

Restriction endonucleases, T4 DNA ligase, competent E. coli DH5α cells, X-gal,
IPTG and plasmids pUC18, pUC19, and pBR322 were purchased from Bethesda Research
Laboratories (BRL), Gaithersburg, MD. VentR® DNA polymerase was purchased from New
England Biolabs (Beverly, MA). Plasmids pGEM®5Zf, pGEM®7Zf, and pGEM®11Zf were
from Promega, Madison, WI, plasmids pIJ4070 and pIJ702 were obtained from the John
Innes Institute, Norwich, England, and plasmids pWHM3 and pWHM4 (J. Bacteriol. 1989
171:5872) were obtained from C. R. Hutchinson, University of Wisconsin, Madison, WI.
[α-32P]dCTP, Hybond™-N nylon membranes, and Megaprime nick translation kits were
from Amersham Corp., Chicago, IL. SeaKem® LE agarose and SeaPlaque® low gelling
temperature agarose were from FMC Bioproducts, Rockland, ME. E. coli K12 strains
carrying the E. coli-Sac. erythraea shuttle plasmids pWHM3 and pWHM4 (Vara et al., J.
Bacteriol., 171:5872 (1989)) and pAIX5 have been deposited at the Agricultural Research
Culture Collection (NRRL), 1815 N. University Street, Peoria, Illinois 61604, as of
December 5, 1995, under the terms of the Budapest Treaty and will be maintained for a
period of thirty (30) years from the date of deposit, or for five (5) years after the last request
for the deposit, or for the enforceable period of the U.S. patent, whichever is longer.
Plasmids pWHM3, pWHM4 and pAIX5 were accorded the accession numbers NRRL B-21512,
NRRL B-21513 and NRRL B-21514, respectively. Sac. erythraea strain NRRL2338
is also available from the Agricultural Research Service culture collection. Staphylococcus
aureus ThR (thiostrepton resistant) was obtained by plating 10^ cells of S. aureus on agar
medium containing 10 µg/ml thiostrepton and picking a survivor after 48 hr growth at 37°C.
Thiostrepton was obtained from Sigma Chemical, St. Louis, MO. All other chemicals and
reagents were from standard commercial sources unless otherwise specified.

DNA Manipulations

Standard conditions were employed for restriction endonuclease digestion, agarose
gel-electrophoresis, isolation of DNA fragments from low melting agarose gels, DNA
ligation, plasmid isolation from E. coli by alkaline lysis, and transformation of E. coli
employing selection for ampicillin resistance (150 µg/ml) on LB agar plates (Sambrook et al.,
Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Laboratory Press, Plainview,
NY, 1989). Total DNA from Sac. erythraea and Streptomyces species (including S. fradiae,
S. celestes, S. violaceoniger, S. hygroscopicus, S. venezuelae) was prepared according to
described procedures (Hopwood et al., Genetic Manipulation of Streptomyces, A Laboratory
Manual, John Innes Foundation, Norwich, UK (1985)). Transfer of DNA from agarose gels
to Hybond™-N membranes and Southern analysis using Megaprime™ nick translated probes
were performed according to the manufacturer's instructions.

Amplification of DNA Fragments 

Synthetic deoxyoligonucleotides were synthesized on an ABI Model 380A
synthesizer (Applied Biosystems, Foster City, CA) following the manufacturer's
recommendations. Amplification of DNA fragments was performed by the polymerase chain
reaction (PCR) using a Perkin Elmer GeneAmp® PCR System 9600. Reactions contained
100 pmol of each primer, 1 µg of template DNA (chromosomal DNA from Sac. erythraea
NRRL2338), and 2 units VentR® DNA polymerase in a 100 µl volume of PCR buffer (10 mM KCl,
10 mM (NH4)2SO4, 20 mM Tris-HCl (pH 8.8 at 25°C), 2.5 mM MgSO4, 0.1% Triton® X-100)
containing dATP (200 µM), dTTP (200 µM), dCTP (250 µM), and dGTP (250 µM).
The reaction mixture was subjected to 30 cycles. Each cycle consisted of one period of 35
sec at 96°C and one period of 2 min at 72°C. The reaction products were visualized and
purified from low melting agarose. The PCR primers described in the examples were derived
from the nucleotide sequence of the eryB and eryC genes of FIG. 4.
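
For orientation, two quantities implied by the reaction conditions above can be worked out
directly: the final concentration of each primer, and the total programmed hold time of
the cycling protocol. The short Python sketch below does the arithmetic; the function and
variable names are assumptions, and the hold time excludes ramping between temperatures,
which the text does not specify.

```python
def final_concentration_uM(amount_pmol, volume_ul):
    """Concentration in micromolar (1 pmol per microliter equals 1 micromolar)."""
    return amount_pmol / volume_ul


# 100 pmol of each primer in a 100 microliter reaction.
primer_uM = final_concentration_uM(100, 100)

# 30 cycles, each with 35 s at the first temperature and 2 min at the second.
cycles = 30
seconds_per_cycle = 35 + 2 * 60
total_hold_minutes = cycles * seconds_per_cycle / 60

print(primer_uM, total_hold_minutes)
```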

Transformation and Gene Replacement in Sac. erythraea

Protoplasts of Sac. erythraea strains were prepared and transformed with miniprep
DNA isolated from E. coli according to published procedures (Yamamoto et al., J.
Antibiotics, 39:1304 (1986)). Non-integrative transformants, in the case of pWHM4
derivatives, were selected by regenerating the protoplasts and overlaying with thiostrepton
(final concentration 20 µg/ml) as described (Weber et al., Gene, 68:173 (1988)). Integrative
transformants, in the case of pWHM3 derivatives, were selected on thiostrepton-containing
agar plates (15 µg/ml) as described by Weber et al., Gene, 68:173 (1988). Loss of the ThR
phenotype was monitored after two rounds of non-selective growth in SGGP media
(Yamamoto et al., J. Antibiotics, 39:1304 (1986)) followed by protoplasting and serial
dilution on non-selective agar media. Regenerated protoplasts were replica plated on
thiostrepton-containing media. ThS (thiostrepton-sensitive) colonies arose at a frequency of
10^-1. Retention of the mutant allele was established by Southern hybridization of several
ThS colonies.

Fermentation 

Sac. erythraea or Streptomyces cells are inoculated into 100 ml SCM medium (1.5%
soluble starch, 2.0% Difco Soytone, 0.15% Yeast Extract, 0.01% CaCl2) and allowed to grow
for 3 to 6 days. The entire culture is then inoculated into 10 liters of fresh SCM medium.
The fermenter is operated for a period of 4 to 7 days at 32°C, maintaining constant aeration
and pH at 7.0. After the fermentation is complete, the cells are removed by centrifugation at
4°C and the fermentation beer is kept cold until further use. When antibiotic selection to
maintain a plasmid, such as pXC4 or pXB6, is required, thiostrepton (10 µg/ml) is added to
both the 100 ml starter culture and the 10-liter fermenter.

The invention will be better understood in connection with the following examples,
which are intended as an illustration of and not a limitation upon the scope of the invention.
Both below and throughout the specification, it is intended that citations to the literature be
expressly incorporated by reference.

Example 1: Construction and characterization of Sac. erythraea ERBIV that produces
4"-deoxy-4"-oxo-erythromycin A



A. Construction of Plasmid pRBIV: A 4.3 kb PstI-HindIII fragment, which included
the eryBIV gene, was isolated from the plasmid pAIX5 and subcloned into PstI-HindIII
digested pUC19 to generate plasmid pUCBIV. After transformation and isolation of the
plasmid from E. coli, the identity of pUCBIV was confirmed by digestion with MunI which
released a fragment of 370 bp. Plasmid pUCBIV was then cut with the restriction enzyme
NcoI, the restriction site filled in with Klenow enzyme, and the plasmid religated to generate
plasmid pNCOBIV (which now carried a frameshift mutation in the eryBIV gene). After
transformation and isolation of the plasmid from E. coli, the identity of pNCOBIV was
confirmed by digestion with NsiI and HindIII which released a fragment of 1.59 kb. (The
NsiI site was formed by the fill-in and religation of the NcoI site.) Finally, plasmid
pNCOBIV was digested with HindIII and SstI and the 3.2 kb fragment carrying the altered
eryBIV gene was isolated and ligated into HindIII and SstI digested pWHM3 to generate
plasmid pRBIV. After transformation and isolation of the plasmid from E. coli, the identity
of pRBIV was confirmed by digestion with KpnI which released fragments of 5.2 kb, 4.4 kb,
and 0.72 kb.
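
The sequence consequence of the fill-in step in part A can be traced with a few lines of
Python. NcoI recognizes CCATGG and cuts after the first C, leaving a four-base 5' overhang;
filling in both ends and rejoining them duplicates those four bases, which creates an NsiI
site (ATGCAT), removes the NcoI site, and shifts the reading frame. The twelve-base test
sequence and the function name below are hypothetical; only the two recognition sequences
are standard facts.

```python
NCOI = "CCATGG"  # NcoI cuts C^CATGG, leaving a 4-base 5' overhang (CATG)
NSII = "ATGCAT"  # NsiI recognition sequence


def fill_in_and_religate(seq, site=NCOI, cut_offset=1):
    """Cut once at `site`, fill in the 5' overhangs, and rejoin the blunt ends."""
    pos = seq.index(site)
    overhang_len = len(site) - 2 * cut_offset
    # After fill-in, the left fragment's top strand extends through the overhang.
    left_end = pos + cut_offset + overhang_len
    # The right fragment's top strand begins at the cut and already includes the overhang.
    right_start = pos + cut_offset
    return seq[:left_end] + seq[right_start:]


wild_type = "AAACCATGGTTT"  # hypothetical sequence with a single NcoI site
mutant = fill_in_and_religate(wild_type)

print(mutant)
print(NSII in mutant, NCOI in mutant, len(mutant) - len(wild_type))
```

The net insertion of four bases is not a multiple of three, which is consistent with the
frameshift noted above, and the exchange of an NcoI site for an NsiI site is what allows
the mutant allele to be distinguished by restriction digestion.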

B. Construction of Sac. erythraea ERBIV: Sac. erythraea protoplasts were
transformed with plasmid pRBIV and integrative transformants selected as described in
General Methods. Resolution of the integrants by nonselective growth as described in
General Methods yielded Sac. erythraea ERBIV in which the wild type copy of the eryBIV
gene was replaced with the inactive mutant copy. Gene replacement was confirmed by
Southern analysis of NcoI digested Sac. erythraea DNA and NcoI-NsiI digested Sac.
erythraea DNA using the 1.58 kb NcoI-HindIII fragment isolated from plasmid pUCBIV
(coordinates 681-2214, FIG. 4B) as a probe. Wild type Sac. erythraea and wild type
resolvants display a hybridizing DNA fragment of 2.75 kb when digested with either NcoI or
NcoI-NsiI, whereas Sac. erythraea strain ERBIV is characterized by hybridization to either a
16 kb DNA fragment or a 2.75 kb DNA fragment when digested with NcoI or NcoI-NsiI,
respectively.

C. Isolation, purification, and properties of 4"-deoxy-4"-oxo-erythromycin A from
Sac. erythraea ERBIV: Sac. erythraea strain ERBIV is fermented for 4 days in SCM media
as described in General Methods. The fermentation broth of Sac. erythraea ERBIV is then
cooled to 4°C, adjusted to pH 4.0, and extracted once with methylene chloride. The
aqueous layer is readjusted to pH 9.0 and extracted twice with methylene chloride and the
combined basic methylene chloride extracts are concentrated to a solid residue. This is
digested in methanol and chromatographed over a column of Sephadex LH-20 in methanol.
Fractions are tested for bioactivity against a sensitive organism, such as Staphylococcus
aureus ThR, and active fractions are combined. The combined fractions are concentrated and
the residue is digested in 10 ml of the upper phase of a solvent system consisting of
n-heptane, benzene, acetone, isopropanol, 0.05 M, pH 7.0 aqueous phosphate buffer
(5:10:3:2:5, v/v/v/v/v), and chromatographed on an Ito Coil Planet Centrifuge in the same
system. Active fractions are combined, concentrated and partitioned between methylene
chloride and dilute ammonium hydroxide (pH 9.0). The methylene chloride layer is
separated and concentrated to yield the desired product as a white foam.
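
The acid wash at pH 4.0 followed by extraction at pH 9.0 relies on the basic dimethylamino group of the desosamine sugar. The sketch below illustrates the principle with the Henderson-Hasselbalch relation. The pKa of 8.8 is an assumption (a commonly cited value for erythromycin A); no pKa is given here for the novel compound, so the figures are illustrative only.

```python
def fraction_protonated(ph: float, pka: float) -> float:
    """Fraction of a monobasic amine present as the charged (protonated)
    form at a given pH, from the Henderson-Hasselbalch relation."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


ASSUMED_PKA = 8.8  # assumption: literature value for erythromycin A

for ph in (4.0, 9.0):
    charged = fraction_protonated(ph, ASSUMED_PKA)
    print(f"pH {ph}: {charged:.4f} protonated, {1.0 - charged:.4f} neutral")
```

Under this assumption the macrolide is almost entirely charged at pH 4.0 and stays in the aqueous layer while neutral impurities are removed, whereas at pH 9.0 a substantial neutral fraction is available to partition into methylene chloride over repeated extractions.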

Example 2: Construction and characterization of Sac. erythraea ER720(pASBVII) that
produces 3-α-D-mycarosyl-5-β-D-desosaminoyl-12-hydroxy-erythronolide B

A. Construction of plasmid pASX2 (see FIG. 7): The 290 bp EcoRI-BamHI segment
carrying the ermE* promoter is isolated from plasmid pIJ4070 and ligated into EcoRI-BamHI
digested pWHM4 DNA to form pASX1. After transformation and isolation of the plasmid
from E. coli, the identity of pASX1 is confirmed by digestion with ApaLI which releases
fragments of 3.9 kb, 2.5 kb, 1.2 kb, 0.5 kb, and 0.4 kb. Two oligonucleotides of the
sequences: SEQ ID NO:31
(5'-GATCCAGCGTCTGCAGGCATGCTCTAGATACAATTAAAGGCTCCTTTTGGAGCCTTTTTTTTTGGAGATTTTCAACGT-3')
and SEQ ID NO:32
(5'-AGCTACGTTGAAAATCTCCAAAAAAAAAGGCTCCAAAAGGAGCCTTTAATTGTATCTAGAGCATGCCTGCAGACGCTG-3'),
corresponding to the (+) and (-) strands of the bacteriophage fd gene VIII transcription
terminator (t-fd) (Beck et al. (1978) Nucl Acids Res. 5:4495) and including restriction
enzyme sites for the enzymes PstI, SphI, and XbaI and overhanging ends compatible with
BamHI and HindIII, are synthesized. Approximately 250 ng of each oligonucleotide are then
mixed together in TE buffer and heated to 99°C for 1 min. The solution is cooled slowly to
room temperature, allowing the oligonucleotides to anneal through their mutual
complementarity, and the annealed oligonucleotides are then ligated into BamHI-HindIII
digested pASX1 to give pASX2. After transformation and isolation of the plasmid from
E. coli, the identity of pASX2 is confirmed by DNA sequencing of the 1.2 kb EcoRI-SalI
fragment that contains the ermE* promoter and the bacteriophage fd terminator.
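
The design of the two terminator oligonucleotides can be checked computationally. The sketch below is illustrative only: the sequences are transcriptions of SEQ ID NO:31 and SEQ ID NO:32 as printed above (homopolymer runs are written with repeat counts to make them easier to audit), and the helper name is hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Transcribed from SEQ ID NO:31 and SEQ ID NO:32 (transcription is an assumption).
SEQ31 = ("GATC" + "CAGCGT" + "CTGCAG" + "GCATGC" + "TCTAGA" + "TACAATTAAAGGCTCC"
         + "T" * 4 + "GGAGCC" + "T" * 9 + "GGAGATTTTCAACGT")
SEQ32 = ("AGCT" + "ACGTTGAAAATCTCC" + "A" * 9 + "GGCTCC" + "A" * 4
         + "GGAGCCTTTAATTGTA" + "TCTAGA" + "GCATGC" + "CTGCAG" + "ACGCTG")

# After annealing, the first four bases of each strand remain single-stranded.
duplex_ok = SEQ31[4:] == revcomp(SEQ32[4:])
print("strands pair outside the overhangs:", duplex_ok)
print("5' overhangs:", SEQ31[:4], "(BamHI-compatible),", SEQ32[:4], "(HindIII-compatible)")

# Internal cloning sites named in the text.
for name, site in {"PstI": "CTGCAG", "SphI": "GCATGC", "XbaI": "TCTAGA"}.items():
    print(name, "present:", site in SEQ31)
```

The GATC and AGCT single-stranded ends match the overhangs left by BamHI (G^GATCC) and HindIII (A^AGCTT), which is what allows directional ligation into BamHI-HindIII digested pASX1.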

B. Construction of plasmid pASBVII (see FIG. 8): The 598 base pair DNA segment
that carries the eryBVII gene, comprising coordinates 7398-7996 (FIG. 4B), is amplified by
PCR employing two oligonucleotides, SEQ ID NO:33
(5'-GATCGCATGCTCTAGAGTACGTGAGCTGGCGGTGGCGGGC-3') and SEQ ID NO:34
(5'-GATCCGGATCCGCATGCTTCACCTGCCGGTGCTGGCGGG-3'). After digestion of
the purified PCR product with BamHI-XbaI the PCR fragment is ligated to BamHI-XbaI
digested pASX2 to give pASBVII. After transformation and isolation of the plasmid from
E. coli, the identity of pASBVII is verified by DNA sequencing of the 880 bp EcoRI-XbaI
insert.
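
Segment lengths in these constructions are quoted alongside sequence coordinates. A minimal sketch of the arithmetic, assuming (as the quoted figures suggest) that length is taken as the end coordinate minus the start coordinate; the function name is hypothetical.

```python
def amplicon_length(start: int, end: int) -> int:
    """Length of a segment quoted as 'coordinates start-end', using the
    end-minus-start convention that reproduces the sizes stated in the text."""
    if end <= start:
        raise ValueError("end coordinate must be greater than start coordinate")
    return end - start


print(amplicon_length(7398, 7996))  # 598, the eryBVII segment
print(amplicon_length(1673, 2478))  # 805, the pBR322 segment used for pK1
```

The same subtraction gives 2240, 1556, and 602 for the eryBIV/eryBV, eryBVI, and eryBVII segments of Example 3, in agreement with the stated 2.24 kb, 1.56 kb, and 0.6 kb.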

C. Construction of Sac. erythraea ER720(pASBVII): Sac. erythraea strain ER720
protoplasts are transformed with plasmid pASBVII and transformants are selected for with
thiostrepton (15 µg/ml). To confirm transformation, total DNA is isolated from ThR colonies
and used to transform E. coli. After transformation and isolation of the plasmid from E. coli,
the identity of pASBVII is verified by restriction analysis with the enzymes PvuII and BamHI,
which releases a 1.48 kb fragment. Those Sac. erythraea colonies that are found to contain
pASBVII are designated Sac. erythraea ER720(pASBVII).

D. Isolation, purification, and properties of 3-α-D-mycarosyl-5-β-D-desosaminoyl-
12-hydroxy-erythronolide B from Sac. erythraea ER720(pASBVII): Sac. erythraea
ER720(pASBVII) is fermented for 3 days in SCM media with thiostrepton selection as
described in General Methods. The fermentation broth is then cooled to 4°C, adjusted to
pH 4.0, and extracted once with methylene chloride. The aqueous layer is readjusted to pH
9.0 and extracted twice with methylene chloride and the combined extracts are concentrated
to a solid residue. This is digested in methanol and chromatographed over a column of
Sephadex LH-20 in methanol. Fractions are tested for bioactivity against a sensitive
organism, such as Staphylococcus aureus ThR, and active fractions are combined. The
combined fractions are concentrated and the residue is digested in 10 ml of the upper phase of
a solvent system consisting of n-heptane, benzene, acetone, isopropanol, 0.05 M, pH 7.0
aqueous phosphate buffer (5:10:3:2:5, v/v/v/v/v), and chromatographed on an Ito Coil Planet
Centrifuge in the same system. Active fractions are combined, concentrated and partitioned
between methylene chloride and dilute ammonium hydroxide (pH 9.0). The methylene
chloride layer is separated and concentrated to yield the desired product as a white foam.

Example 3: Construction and characterization of Streptomyces antibioticus ATCC
11891(pXB6) that produces 3-des-oleandrosyl-3-mycarosyl oleandomycin

A. Construction of plasmid pKB6 and intermediates (see FIG. 9)

i) Construction of plasmid pK1: The DNA sequences of pBR322 (GenBank
Accession #: J01749) and pUC19 (GenBank Accession #: X02514) are known. The 805 nt
DNA segment comprising coordinates 1673 through 2478 of pBR322 is amplified by PCR
employing two oligodeoxynucleotides, SEQ ID NO:35
(5'-GATCACATGTTCTTTCCTGCGTTATCCCCTG-3') and SEQ ID NO:36
(5'-GATCGGATCCATGCATGTCTAGAGCATCGGAGGATGCTGCTGGC-3'). After digestion of the
purified PCR product with AflIII and BamHI the fragment is ligated into AflIII and BamHI
digested pUC19 to give plasmid pK1. The identity of plasmid pK1, after transformation and
isolation from E. coli, is verified by PvuII digestion which releases fragments of 0.55 kb
and 2.55 kb. Plasmid pK1 contains the ROP region of pBR322 that controls plasmid copy
number.

ii) Construction of plasmid pKB1: The 2.24 kb DNA segment that carries the
eryBIV and eryBV genes, comprised between coordinates 56 and 2296 of the sequence
presented in SEQ ID NO:2, is amplified by PCR employing two deoxyoligonucleotides,
SEQ ID NO:37 (5'-GAATGCATCCTGGAAAGCGAGCAAATGCTCCGGTG-3') and SEQ
ID NO:38 (5'-GATCTAGAGCTAGCCGGCGTGGCGGCGCGTG-3'). After digestion with
NsiI and XbaI the fragment is ligated into NsiI and XbaI digested pK1 to yield plasmid pKB1,
5.3 kb in size. The identity of plasmid pKB1, after transformation and isolation from E. coli,
is verified by KpnI digestion which releases fragments of 0.72 kb, 1.14 kb and 3.42 kb.

iii) Construction of plasmid pKB2: The 1.56 kb DNA segment that carries
the eryBVI gene, comprised between coordinates 3121 and 4677 of the sequence presented in
SEQ ID NO:2, is amplified by PCR employing two deoxyoligonucleotides, SEQ ID NO:39
(5'-GATCGCTAGCCGTGACCGGACCCTTACAGTGAGTG-3') and SEQ ID NO:40
(5'-GATCTAGACTTAAGTCATCCGGCGGTCCTGGTGTAGACGGC-3'). After digestion
with NheI and XbaI the fragment is ligated into NheI and XbaI digested pKB1 to give plasmid
pKB2, 6.9 kb in size. The identity of plasmid pKB2, after transformation and isolation from
E. coli, is confirmed by BamHI digestion which releases fragments of 0.22 kb, 0.40 kb, 2.6
kb and 3.7 kb.

iv) Construction of plasmid pKB3: The 0.6 kb DNA segment that carries the
eryBVII gene, comprised between coordinates 7385 and 7987 of the sequence presented in
SEQ ID NO:2, is amplified by PCR employing two deoxyoligonucleotides, SEQ ID NO:41
(5'-GATCTTAAGAACCGGAGTTGCGAGTACGTGAGCTGGCG-3') and SEQ ID NO:42
(5'-GATCTAGACCTAGGTCACCTGCCGGTGCTGGCGGGCTC-3'). After digestion with
AflII and XbaI the fragment is ligated into AflII and XbaI digested pKB2 giving plasmid
pKB3, 7.5 kb in size. The identity of plasmid pKB3, after transformation and isolation from
E. coli, is verified by PstI digestion which releases fragments of 1.1 kb and 6.4 kb.
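
The primers in steps (ii) through (iv) appear to follow a chaining pattern: each reverse primer carries an XbaI site followed by the site that the next step's forward primer will use, so every new plasmid exposes a fresh cloning slot. The sketch below scans the 5' tails of the primers to make that pattern visible. The primer strings are transcriptions of SEQ ID NO:37-42 as printed above, the recognition sequences are the standard ones for the named enzymes, and the helper name and 16-base tail window are hypothetical choices.

```python
# Recognition sequences of the enzymes named in steps (ii)-(iv).
SITES = {
    "NsiI": "ATGCAT",
    "NheI": "GCTAGC",
    "AflII": "CTTAAG",
    "AvrII": "CCTAGG",
    "XbaI": "TCTAGA",
}

# (forward primer, reverse primer) for each step; transcription is an assumption.
PRIMERS = {
    "pKB1": ("GAATGCATCCTGGAAAGCGAGCAAATGCTCCGGTG",
             "GATCTAGAGCTAGCCGGCGTGGCGGCGCGTG"),
    "pKB2": ("GATCGCTAGCCGTGACCGGACCCTTACAGTGAGTG",
             "GATCTAGACTTAAGTCATCCGGCGGTCCTGGTGTAGACGGC"),
    "pKB3": ("GATCTTAAGAACCGGAGTTGCGAGTACGTGAGCTGGCG",
             "GATCTAGACCTAGGTCACCTGCCGGTGCTGGCGGGCTC"),
}


def tail_sites(primer: str, tail: int = 16) -> list[str]:
    """Names of the sites found in the first `tail` bases, in 5'->3' order."""
    head = primer[:tail]
    hits = [(head.find(seq), name) for name, seq in SITES.items() if seq in head]
    return [name for _, name in sorted(hits)]


for step, (forward, reverse) in PRIMERS.items():
    print(step, "forward:", tail_sites(forward), "reverse:", tail_sites(reverse))
```

Read in order, the output shows NsiI/NheI for pKB1, NheI/AflII for pKB2, and AflII/AvrII for pKB3, with XbaI as the constant outer site of every reverse primer.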

v) Construction of plasmid pKB4: The 1.0 kb DNA segment that carries the
eryBII gene, comprised between coordinates 2385 and 3410 of the sequence presented in
SEQ ID NO:1, is amplified by PCR employing two deoxyoligonucleotides, SEQ ID NO:43
(5'-GATCCTAGGCCGCAGGAAGGAGAGAACCACG-3') and SEQ ID NO:44
(5'-GATCTAGATTAATCACTGCAACCAGGCTTCCGGC-3'). Following digestion with
AvrII and XbaI the fragment is ligated into AvrII and XbaI digested pKB3 yielding the desired
plasmid pKB4. After transformation and isolation of the plasmid from E. coli, the identity of
pKB4, 8.5 kb in size, is verified by BglII and EcoRI digestion which releases fragments of
0.41 kb, 1.6 kb, 3.1 kb and 3.4 kb.

vi) Construction of plasmid pKB5: The DNA sequence of eryBIII has been
reported (Haydock et al. (1991) Mol Gen Genet 230:120). The 1.3 kb DNA segment that
carries the eryBIII gene, comprised between coordinates 3965 and 5232 of the sequence
depicted in Haydock et al., is amplified by PCR employing two deoxyoligonucleotides, SEQ
ID NO:45 (5'-GATTAATTGGCCGCGGCGCGGCGCTCGTTATG-3') and SEQ ID NO:46
(5'-GATCTAGATAATTAATCATACGACTTCCAGTCGGGGTAG-3'). After digestion
with MseI and XbaI the fragment is ligated into MseI and XbaI digested pKB4 to give the
desired plasmid pKB5, 9.8 kb in size. The identity of pKB5, after transformation and
isolation from E. coli, is verified by PstI digestion which releases fragments of 1.1 kb, 2.5 kb,
and 6.1 kb, visualized by gel electrophoresis.

vii) Construction of plasmid pKB6: The eryBI gene has been mapped
(Haydock et al. (1991) Mol Gen Genet 230:120) and the DNA sequence on both flanks of
eryBI is known (Haydock et al. (1991) Mol Gen Genet 230:120 and GenBank Accession #
M11200). The 2.5 kb DNA segment that carries the eryBI gene, comprised between
coordinates 1.1 and 3.6 of the map presented in Haydock et al., is amplified by PCR
employing two deoxyoligonucleotides: SEQ ID NO:47
(5'-GATTAATTAATGATCAAGCTGAAAATTGTTTGCATG-3') and SEQ ID NO:48
(5'-GATCTAGACTGCCGGCTCAGCCTTCCCAGGTTCG-3'). After digestion with PacI and XbaI
the fragment is ligated into PacI and XbaI digested pKB5 to give plasmid pKB6, 12.3 kb in
size. The identity of pKB6, after transformation and isolation from E. coli, is verified by
BamHI digestion which releases fragments of 0.22 kb, 0.40 kb, 1.4 kb, 2.6 kb, 3.3 kb and
4.4 kb. Plasmid pKB6 carries all of the eryB genes, eryBI-eryBVII, that are involved in the
biosynthesis of mycarose and its attachment to the polyketide.
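
The sizes quoted for the pKB series can be cross-checked in two ways: by adding each insert to the running backbone size, and by summing the diagnostic restriction fragments. The sketch below does both with the figures stated in steps (i) through (vii). The pK1 size is not stated and is inferred here from its two PvuII fragments, the 0.15 kb tolerance is an arbitrary allowance for rounding, and the small polylinker pieces lost at each step are ignored; all three are assumptions.

```python
PK1_KB = 0.55 + 2.55  # assumption: pK1 size inferred from its PvuII fragments

# (plasmid, insert size kb, stated plasmid size kb, diagnostic fragments kb)
STEPS = [
    ("pKB1", 2.24, 5.3, [0.72, 1.14, 3.42]),
    ("pKB2", 1.56, 6.9, [0.22, 0.40, 2.6, 3.7]),
    ("pKB3", 0.6, 7.5, [1.1, 6.4]),
    ("pKB4", 1.0, 8.5, [0.41, 1.6, 3.1, 3.4]),
    ("pKB5", 1.3, 9.8, [1.1, 2.5, 6.1]),
    ("pKB6", 2.5, 12.3, [0.22, 0.40, 1.4, 2.6, 3.3, 4.4]),
]


def check_series(backbone_kb, steps, tol_kb=0.15):
    """Compare the cumulative size and the digest sum against each stated size."""
    rows = []
    running = backbone_kb
    for name, insert_kb, stated_kb, fragments in steps:
        running += insert_kb
        digest_kb = sum(fragments)
        consistent = (abs(running - stated_kb) <= tol_kb
                      and abs(digest_kb - stated_kb) <= tol_kb)
        rows.append((name, round(running, 2), round(digest_kb, 2), stated_kb, consistent))
    return rows


for row in check_series(PK1_KB, STEPS):
    print(row)
```

A check of this kind is a quick way to catch a transcription error in a fragment list before attempting to interpret a gel.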

B. Construction of Plasmid pXSB6 (see FIG. 11): The 9.2 kb NsiI-XbaI segment of
pKB6, prepared as described in Example 3(A)(vii) above, that carries all of the eryB genes is
isolated and ligated into PstI-XbaI digested pASX2, prepared as described in Example 2(A)
above, to give plasmid pXSB6. After transformation and isolation of the plasmid from E.
coli, the identity of pXSB6, 17.2 kb in size, is verified by the observation of fragments of
0.41 kb, 1.9 kb, and 14.9 kb after EcoRI digestion. Plasmid pXSB6 carries all of the eryB
genes in a transcriptional fusion downstream of the ermE* promoter on an E. coli-
Streptomyces shuttle plasmid.

C. Construction of Plasmid pXB6 

i) Construction of plasmid pN702 (see FIG. 10): Two oligonucleotides of the
sequences: SEQ ID NO:49 (5'-GGAATTCAGATCTATGCATTCTAGAA-3') and
SEQ ID NO:50 (5'-CGCGTTCTAGAATGCATAGATCTGAATTCCTGCA-3') that include
restriction enzyme sites for the enzymes EcoRI, BglII, NsiI, and XbaI and overhanging ends
compatible with PstI and MluI are synthesized. Approximately 250 ng of each
oligonucleotide are then mixed together in TE buffer and heated to 99°C for 1 min. After the
solution is cooled slowly to room temperature, allowing the oligonucleotides to anneal
through their mutual complementarity, the annealed oligonucleotides are ligated into
PstI-MluI digested pIJ702 to yield the desired plasmid pN702. After transformation and
isolation of the plasmid




from Streptomyces lividans 1326, the identity of plasmid pN702, 4.3 kb in size, is verified by
the observation of fragments of 0.75 kb and 3.6 kb after EcoRI-BamHI or XbaI-BamHI
digestion.

ii) Construction of plasmid pX1 (see FIG. 10): The 290 bp EcoRI-BamHI
segment that carries the ermE* promoter is isolated from plasmid pIJ4070 and ligated into
EcoRI-BglII digested pN702; the resulting mixture contains the desired plasmid pX1.
After transformation and isolation of the plasmid from Streptomyces lividans
1326, the identity of plasmid pX1, 4.6 kb in size, is verified by the observation of fragments
of 1.0 kb and 3.6 kb after NsiI-BamHI digestion.

iii) Construction of plasmid pXB6 (see FIG. 11): The 9.2 kb NsiI-XbaI
segment of pKB6, prepared as described in Example 3(A)(vii) above, that carries all of the
eryB genes is isolated and ligated into NsiI-XbaI digested pX1 to give the desired plasmid
pXB6. After transformation and isolation of the plasmid from Streptomyces lividans 1326,
the identity of plasmid pXB6, 13.8 kb in size, is verified by the observation of fragments of
0.41 kb, 1.9 kb, and 11.5 kb after EcoRI digestion. Plasmid pXB6 carries all of the eryB
genes in a transcriptional fusion to the ermE* promoter on a Streptomyces plasmid.
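
Two of the ligations in this Example join ends produced by different enzymes: an NsiI end into a PstI-cut vector (pXSB6) and a BamHI end into a BglII-cut vector (pX1). This works because the enzymes in each pair leave identical single-stranded overhangs. The sketch below derives the overhangs from the standard recognition sequences and top-strand cut positions; it assumes palindromic sites, and the function names are hypothetical.

```python
# (recognition sequence, number of top-strand bases before the cut)
ENZYMES = {
    "NsiI": ("ATGCAT", 5),   # ATGCA^T
    "PstI": ("CTGCAG", 5),   # CTGCA^G
    "BamHI": ("GGATCC", 1),  # G^GATCC
    "BglII": ("AGATCT", 1),  # A^GATCT
    "EcoRI": ("GAATTC", 1),  # G^AATTC
    "XbaI": ("TCTAGA", 1),   # T^CTAGA
}


def overhang(name: str) -> tuple[str, str]:
    """Return (overhang polarity, overhang sequence) for a palindromic site."""
    site, cut = ENZYMES[name]
    mirror = len(site) - cut            # bottom-strand cut in top-strand coordinates
    low, high = sorted((cut, mirror))
    polarity = "5'" if cut < mirror else "3'"
    return polarity, site[low:high]


def compatible(first: str, second: str) -> bool:
    """True when two enzymes leave the same palindromic overhang."""
    return overhang(first) == overhang(second)


for pair in [("NsiI", "PstI"), ("BamHI", "BglII"), ("EcoRI", "XbaI")]:
    print(pair, overhang(pair[0]), overhang(pair[1]), compatible(*pair))
```

Ligating compatible ends from two different enzymes generally destroys both recognition sites at the junction, which can be convenient when those sites are to be reused in a later cloning step.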

D. Construction of Streptomyces antibioticus ATCC 11891(pXB6): Approximately
500 µg of plasmid pXB6, isolated from Streptomyces lividans 1326(pXB6), are
electroporated into the oleandomycin producer Streptomyces antibioticus ATCC 11891 and
several of the resulting ThioR colonies that appear on the R3M-agar plates containing
thiostrepton are analyzed for their plasmid content. The presence of plasmid pXB6, 13.8 kb
in size, is verified by the observation of fragments of 0.41 kb, 1.9 kb, and 11.5 kb after EcoRI
digestion.

E. Isolation, purification, and properties of 3-des-oleandrosyl-3-mycarosyl
oleandomycin from Streptomyces antibioticus ATCC 11891(pXB6): Streptomyces
antibioticus ATCC 11891(pXB6) is fermented for 5 days in SCM media with thiostrepton
selection as described in General Methods. The fermentation broth is then cooled to 4°C,
adjusted to pH 4.0, and extracted once with methylene chloride. The aqueous layer is
readjusted to pH 9.0 and extracted twice with methylene chloride and the combined extracts
are concentrated to a solid residue. This is digested in methanol and chromatographed over a
column of Sephadex LH-20 in methanol. Fractions are tested for bioactivity against a
sensitive organism, such as Staphylococcus aureus ThR, and active fractions are combined.
The combined fractions are concentrated and the residue is digested in 10 ml of the upper
phase of a solvent system consisting of n-heptane, benzene, acetone, isopropanol, 0.05 M, pH
7.0 aqueous phosphate buffer (5:10:3:2:5, v/v/v/v/v), and chromatographed on an Ito Coil
Planet Centrifuge in the same system. Closely eluting active fractions are combined,
concentrated and partitioned between methylene chloride and dilute ammonium hydroxide
(pH 9.0). The methylene chloride layer is separated and concentrated to yield the desired
product as a white foam.

Example 4: Construction and characterization of Streptomyces violaceoniger NRRL
2834(pXC4) that produces 5-des-chalcosyl-5-desosaminoyl lankamycin

A. Construction of plasmid pKC4 and intermediates (see FIG. 12)

i) Construction of plasmid pKC1: The 2.4 kb DNA segment that carries the
eryCII and eryCIII genes, comprised between coordinates 33 and 2413 of the sequence
presented in SEQ ID NO:1, is amplified by PCR employing two deoxyoligonucleotides,
SEQ ID NO:51 (5'-GAATGCATCTGGCTGGGCGGAGGGAATTCATG-3') and
SEQ ID NO:52 (5'-GATCTAGACTTAAGTCATCGTGGTTCTCTCCTTCCTGCGGC-3').
After digestion with NsiI and XbaI the purified PCR fragment is ligated into NsiI
and XbaI digested pK1 to give plasmid pKC1, 5.5 kb in size. The identity of plasmid pKC1,
after transformation and isolation from E. coli, is verified by EcoRI digestion which releases
fragments of 2.2 kb and 3.3 kb.

ii) Construction of plasmid pKC2: The 732 bp DNA segment that carries the
eryCVI gene, comprised between coordinates 2331 and 3063 of the sequence presented in
SEQ ID NO:2, is amplified by PCR employing two deoxyoligonucleotides,
SEQ ID NO:53 (5'-GATCCTTAAGCTCCGGAGGGAGCAGGGATG-3') and
SEQ ID NO:54 (5'-GATCTAGACCTAGGTCATCCGCGCACACCGACGAAC-3'). After
digestion with AflII and XbaI the purified PCR fragment is ligated into AflII and XbaI
digested pKC1 to give plasmid pKC2, 6.2 kb in size. The identity of plasmid pKC2, after
transformation and isolation from E. coli, is verified by XbaI-EcoRI digestion which releases
fragments of 0.95 kb, 2.2 kb and 3.1 kb.

iii) Construction of plasmid pKC3: The 2.7 kb DNA segment that carries the
eryCIV and eryCV genes, comprised between coordinates 4650 and 7386 of the sequence
presented in SEQ ID NO:2, is amplified by PCR employing two deoxyoligonucleotides,
SEQ ID NO:55 (5'-GATCCTAGGCCGTCTACACCAGGACCGCCGG-3') and
SEQ ID NO:56 (5'-GATCTAGATTAATCACCTTCCGCGCAGGAAGCCGC-3'). After
digestion with AvrII and XbaI the purified PCR fragment is ligated into AvrII and XbaI
digested pKC2 to yield plasmid pKC3, 9.0 kb in size. The identity of plasmid pKC3, after
transformation and isolation from E. coli, is verified by SphI digestion which releases
fragments of 4.0 kb and 5.0 kb.

iv) Construction of plasmid pKC4: The DNA sequence of the eryCI gene has
been determined (GenBank Accession #X15541). The 1.1 kb DNA segment that carries the
eryCI gene, comprised between coordinates 38 and 1161 of the sequence indicated above, is
amplified by PCR employing two deoxyoligonucleotides, SEQ ID NO:57
(5'-GATCTTAAGCCGCCACTCGAACGGACACTCG-3') and SEQ ID NO:58
(5'-GATCTAGATCAAGCCCGAGCCTTGAGGG-3'). After digestion with MseI and XbaI the
fragment is ligated into MseI and XbaI digested pKC3 to give plasmid pKC4, 10.1 kb in size.
The identity of plasmid pKC4, after transformation and isolation from E. coli, is verified by
KpnI digestion which releases fragments of 0.15 kb, 0.31 kb, 4.1 kb and 5.5 kb. Plasmid pKC4
carries all of the eryC genes, eryCI-eryCVI, that are involved in the biosynthesis of
desosamine and its attachment to the polyketide.

B. Construction of Plasmid pXSC4 (see FIG. 13): The 6.9 kb NsiI-XbaI segment of
pKC4 that carries all of the eryC genes is isolated and ligated into PstI-XbaI digested pASX2,
prepared as described in Example 2(A), to give the desired plasmid pXSC4, 14.9 kb in size,
wherein all of the eryC genes are transcriptionally linked downstream of the ermE* promoter
on an E. coli-Streptomyces shuttle plasmid. The identity of plasmid pXSC4, after
transformation and isolation from E. coli, is verified by the observation of fragments of 0.29
kb, 2.2 kb, and 12.4 kb after EcoRI digestion.

C. Construction of Plasmid pXC4 (see FIG. 13): The 6.9 kb NsiI-XbaI segment of
pKC4 that carries all of the eryC genes is isolated and ligated into NsiI-XbaI digested pX1,
prepared as described in Example 3(C)(ii), to give the desired plasmid pXC4, 11.5 kb in size,
wherein all of the eryC genes are transcriptionally linked downstream of the ermE* promoter
on a Streptomyces plasmid. After transformation and isolation of the plasmid from
Streptomyces lividans 1326, the identity of plasmid pXC4 is verified by the observation of
fragments of 0.29 kb, 2.2 kb, and 9.0 kb after EcoRI digestion.
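
Because the same two vector backbones receive both the eryB and the eryC cassettes, the stated construct sizes allow an independent estimate of each backbone by subtraction. The sketch below uses only sizes stated in Examples 3 and 4. The size of pASX2 is not given in the text, so the value it produces is an inference, and the small fragment lost from the vector on digestion is ignored.

```python
# construct: (backbone, stated construct size kb, stated insert size kb)
CONSTRUCTS = {
    "pXSB6": ("pASX2", 17.2, 9.2),
    "pXSC4": ("pASX2", 14.9, 6.9),
    "pXB6": ("pX1", 13.8, 9.2),
    "pXC4": ("pX1", 11.5, 6.9),
}


def inferred_backbones(constructs):
    """Group the backbone sizes implied by each construct (total minus insert)."""
    estimates = {}
    for total_and_insert in constructs.values():
        backbone, total_kb, insert_kb = total_and_insert
        estimates.setdefault(backbone, []).append(round(total_kb - insert_kb, 2))
    return estimates


print(inferred_backbones(CONSTRUCTS))
```

Both pX1 estimates agree with the 4.6 kb stated for pX1, and the two pASX2 estimates agree with each other, which supports the internal consistency of the reported sizes.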

D. Construction of Streptomyces violaceoniger NRRL 2834(pXC4): Approximately
500 µg of the plasmid pXC4, isolated from Streptomyces lividans 1326(pXC4), are
electroporated into the lankamycin producer Streptomyces violaceoniger NRRL 2834 and
several of the resulting ThioR colonies that appear on the R3M-agar plates containing
thiostrepton are analyzed for their plasmid content. The presence of plasmid pXC4 is verified
by the observation of fragments of 0.29 kb, 2.2 kb, and 9.1 kb in size after EcoRI digestion
of the plasmid.

E. Isolation, purification, and properties of 5-des-chalcosyl-5-desosaminoyl
lankamycin: S. violaceoniger NRRL 2834(pXC4) is fermented for 5 days in SCM media
with thiostrepton selection as described in General Methods. The fermentation broth is then
cooled to 4°C, adjusted to pH 4.0, and extracted once with methylene chloride. The
aqueous layer is readjusted to pH 9.0 and extracted twice with methylene chloride and the
combined extracts are concentrated to a solid residue. This is digested in methanol and
chromatographed over a column of Sephadex LH-20 in methanol. Fractions are tested for
bioactivity against a sensitive organism, such as Staphylococcus aureus ThR, and active
fractions are combined. The combined fractions are concentrated and the residue is digested
in 10 ml of the upper phase of a solvent system consisting of n-heptane, benzene, acetone,
isopropanol, 0.05 M, pH 7.0 aqueous phosphate buffer (5:10:3:2:5, v/v/v/v/v), and
chromatographed on an Ito Coil Planet Centrifuge in the same system. Active fractions are
combined, concentrated and partitioned between methylene chloride and dilute ammonium
hydroxide (pH 9.0). The methylene chloride layer is separated and concentrated to yield the
desired product as a white foam.

Although the present invention is illustrated in the examples listed above in terms of
preferred embodiments, these examples are not to be regarded as limiting the scope of the
invention. The above illustrations serve to describe the principles and methodologies
involved in creating the types of genetic alterations that can be introduced into Sac. erythraea
and/or other Streptomyces that result in the synthesis of novel glycosylation-modified
polyketide products. Although a single Type I alteration, leading to the production of, for
example, 4"-deoxy-4"-oxo-erythromycin A, is specified herein, it is obvious to those skilled
in the art that other Type I changes can be introduced into the eryB and/or eryC genes leading
to novel glycosylation-modified polyketide structures. Examples of additional Type I
alterations leading to useful novel compounds include but are not limited to: mutations in the
eryBVII gene conceivably leading to 3-α-D-mycarosyl-5-β-D-desosaminoyl-12-hydroxy-
erythronolide B and mutations in the eryCVI gene conceivably leading to N-3a'-des-dimethyl
erythromycin A. Moreover, it is obvious that Type I alterations in two or more different eryB
and/or eryC genes can be combined leading to novel glycosylation-modified polyketide
structures. Examples of combinations of two Type I alterations leading to useful compounds
include but are not limited to: mutations in the eryBIV and eryBVII genes conceivably leading
to 3-α-D-4"-deoxy-4"-oxo-mycarosyl-5-β-D-desosaminoyl-12-hydroxy-erythronolide B;
mutations in the eryBIV and eryCVI genes conceivably leading to 4"-deoxy-4"-oxo-(N-3a'-
des-dimethyl)-erythromycin A; and mutations in the eryBIV, eryBVII, and eryCVI genes
conceivably leading to 3-α-D-4"-deoxy-4"-oxo-mycarosyl-5-β-D-(N-3a'-des-dimethyl)-
desosaminoyl-12-hydroxy-erythronolide B. All Type I mutations or combinations of two or
more Type I mutations in the eryBII, eryBIV, eryBV, eryBVI, eryBVII, eryCII, eryCIII,
eryCIV, eryCV, or eryCVI genes, the Sac. erythraea strains that carry said mutations or
combinations of mutations, and the corresponding polyketides produced from said strains,
therefore, are included within the scope of the present invention.
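
The breadth of the combinations referred to above can be made concrete by enumerating them. The sketch below is purely illustrative: it counts the gene sets that could in principle carry Type I alterations, using the ten genes listed in the preceding sentence, and says nothing about which combinations yield a viable strain or a detectable product.

```python
from itertools import combinations

# The ten genes named in the preceding paragraph.
GENES = [
    "eryBII", "eryBIV", "eryBV", "eryBVI", "eryBVII",
    "eryCII", "eryCIII", "eryCIV", "eryCV", "eryCVI",
]


def alteration_sets(genes, min_size=1):
    """Yield every set of genes of at least `min_size` members."""
    for size in range(min_size, len(genes) + 1):
        yield from combinations(genes, size)


pairs = list(combinations(GENES, 2))
print(len(pairs))                               # 45 double alterations
print(sum(1 for _ in alteration_sets(GENES)))   # 1023 non-empty gene sets
print(("eryBIV", "eryBVII") in pairs)           # True: one of the pairs named above
```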

Although the Type II mutation specified herein was constructed with the eryBVII gene
on a self-replicating plasmid, it is obvious that other eryB genes and eryC genes can be
expressed in an antisense orientation leading to novel glycosylation-modified polyketide
structures. Examples of additional Type II alterations leading to useful compounds include
but are not limited to: antisense expression of the eryBIV gene conceivably leading to 4"-
deoxy-4"-oxo-erythromycin A and antisense expression of the eryCVI gene conceivably
leading to N-3a'-des-dimethyl erythromycin A. Moreover, it will occur to those skilled in the
art that promoters other than the ermE* promoter, for example the melC promoter of pIJ702,
will be suitable for antisense expression, and that many self-replicating vectors in addition to
pWHM4 will function to carry the antisense alteration. It will also occur to those skilled in
the art that a self-replicating vector is not required for this invention and that the antisense
alteration can be introduced directly into the chromosome using the same principles
employed to construct a Type I gene alteration. An example of a Type II alteration that is
introduced directly into the chromosome is the eryBVII antisense alteration described in
Example 2 wherein DNA segments immediately upstream of the eryK gene are used to flank
the ermE*-eryBVII-phage fd terminator grouping in a pWHM3 vector, and this vector is
integrated into and then resolved from the chromosome leaving the ermE*-eryBVII-phage fd
terminator grouping stably incorporated into this nonessential region of the chromosome of
Sac. erythraea, conceivably leading to the production of 3-α-D-mycarosyl-5-β-D-
desosaminoyl-12-hydroxy-erythronolide B. All Type II mutations in the eryBII, eryBIV,
eryBV, eryBVI, eryBVII, eryCII, eryCIII, eryCIV, eryCV, or eryCVI genes, whether carried on
a self-replicating plasmid or integrated into a nonessential region of the chromosome, the Sac.
erythraea strains that carry said mutations, and the corresponding polyketides produced from
said strains, therefore, are included within the scope of the present invention.

Although Type III alterations, leading to the production of 5-des-chalcosyl-5-
desosaminoyl lankamycin in Streptomyces violaceoniger and 3-des-oleandrosyl-3-mycarosyl
oleandomycin in Streptomyces antibioticus, are specified herein, it is obvious that Type III
alterations can be introduced into any polyketide producing microorganism leading to novel
glycosylation-modified polyketides. It will also occur to those skilled in the art that both the
eryB and eryC genes can either be cotransformed into a polyketide producing microorganism
or grouped together on a single vector that is introduced into a polyketide producing
microorganism. An example of a Type III change using both the eryB and eryC genes
together is their introduction into Streptomyces violaceoniger conceivably leading to 3-des-
(4"-O-acetylarcanosyl)-3-mycarosyl-5-des-chalcosyl-5-desosaminoyl lankamycin. Although
the Type III alterations specified herein have indicated a specific genetic order of the eryB or
eryC genes, it will occur to those skilled in the art that many different genetic arrangements of
the eryB or eryC genes will produce similar results. It will also occur to those skilled in
the art that certain arrangements of the eryB and/or eryC genes that lack one or more of the
respective eryB and/or eryC genes will lead to the production of novel glycosylated
polyketides in which intermediate compounds in the biosynthesis of mycarose and/or
desosamine, respectively, such as those outlined in FIGS. 2 and 3, are attached to the
polyketide. An example of a Type III alteration in which only a subset of the eryB and/or
eryC genes are used is the introduction of a pXC4 derivative that lacks the eryCVI gene,
removed by digestion of plasmid pXC4 with AflII and AvrII followed by treatment with the
Klenow fragment of DNA polymerase I and religation, into Streptomyces violaceoniger
leading to the production of 5-des-chalcosyl-5-(N-3a-des-dimethyl desosaminoyl)
lankamycin. It will also occur to those skilled in the art that promoters other than
ermE or ermE*, such as the melC promoter of plasmid pIJ702, and vectors other than
pWHM4 or pIJ702 can also be utilized in the construction of a Type III alteration, and these
variants are, of course, considered to be within the scope of the invention. Finally, it will also
occur to those skilled in the art that a self-replicating vector is not required for this invention
and that an assembly of sugar biosynthesis genes can be introduced directly into the
chromosome of a heterologous host using the same principles employed to construct a Type I
gene alteration once a nonessential region of the heterologous host chromosome has been
identified. Alternatively, plasmids or bacteriophages which undergo site-specific
recombination with host genes may also be used to introduce eryB and eryC genes into a host
to effect Type III alterations. All Type III alterations using one or more of the eryBII,
eryBIV, eryBV, eryBVI, eryBVII, eryCII, eryCIII, eryCIV, eryCV, or eryCVI genes, the
polyketide producing strains that carry said alterations, and the corresponding polyketides
produced from said strains, therefore, are included within the scope of the present invention.

In addition, it is also possible to create combinations of Type I and Type II alterations
such that some Type I eryB and/or eryC mutations are introduced directly into the Sac.
erythraea chromosome in the appropriate locus, while other eryB and/or eryC genes are
inactivated by Type II alterations using a self-replicating or integrating vector. For example,
combination of a Type I alteration, such as a mutation in eryBIV, and a Type II alteration,
such as transformation with pASBVII, will conceivably lead to production of 3-α-D-4"-
deoxy-4"-oxo-mycarosyl-5-β-D-desosaminoyl-12-hydroxy-erythronolide B. All
combinations of two or more alterations of Type I and Type II, the Sac. erythraea strains that
carry such alterations, and the glycosylated polyketides produced from such strains are
included within the scope of the present invention.

As an extension of the examples reported with the eryB and/or eryC genes, it is
possible to apply the method described herein to heterologous sugar biosynthesis genes that
are similar to the eryB and/or eryC genes. The construction of strains carrying heterologous
sugar biosynthesis genes that lead to the production of novel glycosylated polyketides
requires: (i) cloning of the sugar biosynthesis genes from any other glycosylated-polyketide
producing actinomycete; (ii) determining the nucleotide sequence of the cloned gene(s); (iii)
excising and assembling the cloned gene(s) into vectors suitable for Type I, Type II, or Type
III alterations; and (iv) transformation of polyketide producing microorganisms and screening
for the novel compound. Any polyketide-associated sugar biosynthesis gene can thus be
precisely excised from the genome of a glycosylated polyketide producing microorganism
and altered or arranged with other sugar biosynthesis genes and then introduced into the same
or another polyketide producing microorganism to create a novel glycosylated polyketide of
predicted structure. Thus, for example, a Type I or Type II alteration of a heterologous gene
that is similar to an eryB and/or eryC gene, such as can be found in the eryBVII homolog for
the synthesis of L-oleandrose in Streptomyces antibioticus, to result in the production of 3-
des-L-oleandrosyl-3-D-oleandrosyl oleandomycin is included within the scope of the present
invention. Similarly, a Type III assembly of the genes for the synthesis of a sugar other than
mycarose or desosamine, such as can be found in the genes for the synthesis of angolosamine
in Streptomyces eurythermus, and their transformation into Sac. erythraea to result in the
synthesis of 5-des-desosaminoyl-5-angolosaminoyl-erythromycin A is included within the
scope of the present invention.

It will occur to those skilled in the art that the Type I, Type II, and Type III genetic
manipulations described herein and the polyketide producing microorganisms into which they
are introduced are in no way exclusive. Hence, the choice of a convenient host and the
choice of a Type I, Type II, or Type III alteration is based solely on the relatedness of the
desired novel glycosylated polyketide to a natural counterpart. Therefore, Type I, Type II,
and Type III alterations can be constructed in any polyketide producing microorganism
employing either endogenous or exogenous sugar biosynthesis genes. Thus all Type I, Type
II, and Type III mutations or various combinations thereof constructed in any polyketide
producing microorganism according to the principles described herein, and the respective
polyketides produced from such strains, are included within the scope of the present
invention. Examples of glycosylated polyketides that can be altered by creating Type I, Type
II, or Type III changes in the producing microorganisms include, but are not limited to,
macrolide antibiotics such as erythromycin, tylosin, spiramycin, etc.; aromatic polyketides
such as daunorubicin and doxorubicin, etc.; polyenes such as candicidin, amphotericins, etc.;
and other complex polyketides such as avermectin.

Whereas the novel derivatives or modifications of erythromycin described herein have
been specified as the A derivatives, such as 4"-deoxy-4"-oxo-erythromycin A, those skilled in
the art understand that the wild type strain of Sac. erythraea produces a family of
erythromycin compounds, including erythromycin A, erythromycin B, erythromycin C, and
erythromycin D. Thus, modified strains of Sac. erythraea, such as strain ERBIV, for
example, would be expected to produce the corresponding members of the 4"-deoxy-4"-oxo-
erythromycin family, including 4"-deoxy-4"-oxo-erythromycin A, 4"-deoxy-4"-oxo-
erythromycin B, 4"-deoxy-4"-oxo-erythromycin C, and 4"-deoxy-4"-oxo-erythromycin D.
Similarly, all other modified strains of Sac. erythraea that produce novel glycosylated
erythromycin derivatives would be expected to produce the A, B, C, and D forms of said
derivatives. For example, modified Sac. erythraea strains that produce 6-deoxyerythromycin,
6,12-dideoxyerythromycin and 6,7-anhydroerythromycin would be expected to produce novel
glycosylation-modified polyketides by introduction of the additional modification of a Type
I, II or III change in a sugar biosynthesis gene. Therefore, all members of the family of each
of the novel erythromycins described herein or produced by these methods are included
within the scope of the present invention.

Variations and modifications of the methods for obtaining the desired plasmids, hosts
for cloning and choices of vectors and eryB and/or eryC genes to clone and modify, other
than those described herein will occur to those skilled in the art. For example, although we
have described the use of plasmids pWHM3, pWHM4, and pIJ702, other vectors can be
employed wherein all or part of said plasmids is replaced by other DNA segments that
function in a similar manner, such as replacing the pUC19 component of pWHM3 and
pWHM4 with pBR322, available from BRL; or employing different segments of the pIJ101
replicon in pWHM3 and pIJ702, or the pJV1 replicon in pWHM4, respectively; or employing
selectable markers other than thiostrepton- or ampicillin-resistance. These are just a few of a
long list of possible examples, all of which are included within the scope of the present
invention. Similarly, the segments of the eryB and eryC loci that have been specified herein
to generate the various Type I, Type II, and Type III alterations can readily be replaced by
other segments of different length encoding the same functions, either produced by
PCR-amplification of genomic DNA or of an isolated clone, or by isolating suitable restriction
fragments from Sac. erythraea. In the same way it is possible to create Type I mutations
functionally equivalent to those described herein by altering through deletion, insertion, or
site directed mutagenesis different portions of the corresponding genes. It is also possible to
create Type II mutations functionally equivalent to those described herein by employing
larger or smaller portions of the corresponding genes; and it is possible to create Type III
mutations using larger or smaller segments of the corresponding genes in the same linear
order as described herein or in a different one. Additional modifications include changes in the
restriction sites used for cloning or in the general methodologies described above. All such
changes are included in the scope of the present invention. It will also occur to those skilled
in the art that different methods are available to ferment Sac. erythraea and other polyketide
producing microorganisms and to extract the novel polyketides specified herein, and all such
methods are also included within the scope of this invention.
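
The paragraph above treats gene segments as interchangeable in length and arrangement, and claim 16 recites reversing the strand orientation of a fragment so that transcription yields an antisense mRNA. Purely as an illustration, and not as part of the disclosed methods, the reverse-complement operation that underlies such an orientation reversal can be sketched in Python. The function name `reverse_complement` and the twelve-nucleotide example fragment are hypothetical and chosen only for the sketch.

```python
# Illustrative sketch only: the helper name and the example fragment are
# assumptions made for this example, not tools or constructs from the text.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A<->T, C<->G)."""
    invalid = set(dna) - set("ACGTacgt")
    if invalid:
        raise ValueError(f"unexpected characters in sequence: {sorted(invalid)}")
    # Complement every base, then reverse the string to read 5' to 3'.
    return dna.translate(_COMPLEMENT)[::-1]


fragment = "ATGACCACGACC"  # hypothetical sense-strand fragment
antisense = reverse_complement(fragment)
print(antisense)  # GGTCGTGGTCAT
```

Applying the function twice returns the original fragment, which is a quick sanity check on any orientation bookkeeping of this kind.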

It will also be apparent that many modifications and variations of the invention as set
forth herein are possible without departing from the spirit and scope thereof, and that,
accordingly, such limitations are imposed only as indicated by the appended claims.

We claim:

1. An isolated single or double stranded polynucleotide having a nucleotide sequence
which comprises (a) a nucleotide sequence selected from the group consisting of (i) the
sense sequence of SEQ ID NO:1 from about nucleotide position 54 to about nucleotide
position 1136; (ii) the sense sequence of SEQ ID NO:1 from about nucleotide position 1147
to about nucleotide position 2412; (iii) sense sequence of SEQ ID NO:1 from about
nucleotide position 2409 to about nucleotide position 3410; (iv) the sense sequence of SEQ
ID NO:2 from about nucleotide position 80 to about nucleotide position 1048; (v) the sense
sequence of SEQ ID NO:2 from about nucleotide position 1048 to about nucleotide position
2295; (vi) the sense sequence of SEQ ID NO:2 from about nucleotide position 2348 to about
nucleotide position 3061; (vii) the sense sequence of SEQ ID NO:2 from about nucleotide
position 3214 to about nucleotide position 4677; (viii) the sense sequence of SEQ ID NO:2
from about nucleotide position 4674 to about nucleotide position 5879; (ix) the sense
sequence of SEQ ID NO:2 from about nucleotide position 5917 to about nucleotide position
7386; and (x) the sense sequence of SEQ ID NO:2 from about nucleotide position 7415 to
about nucleotide position 7996;

(b) sequences complementary to the sequences of (a);

(c) sequences that, on expression, encode a polypeptide encoded by the
sequences of (a); and

(d) analogous sequences that hybridize under stringent conditions to the
sequences of (a).

2. The polynucleotide of claim 1 that is a DNA molecule or RNA molecule.

3. The polynucleotide of claim 2 wherein the nucleotide sequence is the nucleotide
sequence of (a) selected from the group consisting of (i) the sense sequence of SEQ ID NO:1
from about nucleotide position 54 to about nucleotide position 1136; (ii) the sense sequence
of SEQ ID NO:1 from about nucleotide position 1147 to about nucleotide position 2412; (iii)
the sense sequence of SEQ ID NO:2 from about nucleotide position 2348 to about nucleotide
position 3061; (iv) the sense sequence of SEQ ID NO:2 from about nucleotide position 4674
to about nucleotide position 5879; and (v) the sense sequence of SEQ ID NO:2 from about
nucleotide position 5917 to about nucleotide position 7386.

4. The polynucleotide of claim 2 wherein the nucleotide sequence is the nucleotide
sequence of (a) selected from the group consisting of (i) sense sequence of SEQ ID NO:1
from about nucleotide position 2409 to about nucleotide position 3410; (ii) the sense
sequence of SEQ ID NO:2 from about nucleotide position 80 to about nucleotide position
1048; (iii) the sense sequence of SEQ ID NO:2 from about nucleotide position 1048 to about
nucleotide position 2295; (iv) the sense sequence of SEQ ID NO:2 from about nucleotide
position 3214 to about nucleotide position 4677; and (v) the sense sequence of SEQ ID NO:2
from about nucleotide position 7415 to about nucleotide position 7996.

5. The polynucleotide of claim 2 wherein the nucleotide sequence is the nucleotide
sequence of (a) having the sense sequence of SEQ ID NO:2 from about nucleotide position
80 to about nucleotide position 1048.

6. A vector comprising the DNA molecule of claim 2. 

7. The vector of claim 6 further comprising an enhancer-promoter operatively linked to 
the polynucleotide. 

8. The vector of claim 6 wherein the polynucleotide has the nucleotide sequence of 
claim 5. 

9. A host cell transformed with the vector of claim 6 or claim 7 or claim 8. 

10. The transformed host cell of claim 9 that is a bacterial cell. 

11. The transformed host cell of claim 10 wherein the bacterial cell is selected from the
group consisting of Streptomyces and E. coli.

12. A method for directing the biosynthesis of specific glycosylation-modified
polyketides by genetic manipulation of a polyketide-producing microorganism, said method
comprising the steps of:

(1) isolating a sugar biosynthesis gene-containing DNA sequence according to claim 1;

(2) identifying within said gene-containing DNA sequence one or more DNA
fragments responsible for the biosynthesis of a polyketide-associated sugar or its attachment
to a polyketide;

(3) creating one or more specified changes into said DNA fragment or fragments,
thereby resulting in an altered DNA sequence;

(4) introducing said altered DNA sequence into a polyketide-producing
microorganism to replace the original sequence, said altered DNA sequence, when translated,
resulting in altered enzymatic activity capable of effecting the production of said specific
glycosylation-modified polyketide;

(5) growing a culture of said altered polyketide-producing microorganism under
conditions suitable for the formation of said specific glycosylation-modified polyketide; and

(6) isolating said specific glycosylation-modified polyketide from said culture.

13. The method of claim 12 wherein said specified change in said DNA fragment or 
fragments results in the inactivation of at least one enzymatic activity involved in the 
biosynthesis of a polyketide-associated sugar or in its attachment to a polyketide. 

14. The method of claim 13 wherein said polyketide-associated sugar is L-mycarose.

15. The method of claim 13 wherein said polyketide-associated sugar is D-desosamine.

16. A method for directing the biosynthesis of specific glycosylation-modified
polyketides by genetic manipulation of a polyketide-producing microorganism, said method
comprising the steps of:

(1) isolating a sugar biosynthesis gene-containing DNA sequence according to claim 1;

(2) identifying within said gene-containing DNA sequence one or more DNA 
fragments responsible for the biosynthesis of a polyketide-associated sugar or its attachment 
to a polyketide; 

(3) reversing the strand orientation of said DNA fragment or fragments, thereby 
resulting in an altered DNA sequence which, when transcribed, results in production of an 
antisense mRNA; 

(4) introducing said altered DNA sequence into a polyketide-producing 
microorganism having an mRNA capable of binding to said antisense mRNA to produce an 
altered polyketide-producing microorganism capable of producing said specific 
glycosylation-modified polyketide; 

(5) growing a culture of said altered polyketide-producing microorganism under 
conditions suitable for the formation of said specific glycosylation-modified polyketide; and 

(6) isolating said specific glycosylation-modified polyketide from said culture. 

17. A method for directing the biosynthesis of specific glycosylation-modified
polyketides by genetic manipulation of a polyketide-producing microorganism, said method
comprising the steps of:

(1) isolating a sugar biosynthesis gene-containing DNA sequence according to claim 1;

(2) identifying within said gene-containing DNA sequence one or more DNA
fragments responsible for the biosynthesis of a polyketide-associated sugar or its attachment
to a polyketide;

(3) introducing said DNA fragment or fragments into a distinct polyketide-producing
microorganism to produce an altered polyketide-producing microorganism capable of
producing said specific glycosylation-modified polyketide;

(4) growing a culture of said polyketide-producing microorganism containing said
DNA fragment or fragments under conditions suitable for the formation of said specific
glycosylation-modified polyketide; and

(5) isolating said specific glycosylation-modified polyketide from said culture.

18. The method of claim 13 or claim 16 or claim 17 wherein said DNA fragment
comprises one or more genes which encode an enzymatic activity involved in the
biosynthesis of L-mycarose or in its attachment to a polyketide.

19. The method of claim 13 or claim 16 or claim 17 wherein said DNA fragment
comprises one or more genes which encode an enzymatic activity involved in the
biosynthesis of D-desosamine or in its attachment to a polyketide.

20. The method of claim 13 or claim 16 or claim 17 wherein said DNA fragment is the
sequence of claim 8.

21. An isolated polypeptide having an amino acid sequence encoded by a nucleotide
sequence selected from the group consisting of the sense sequence of SEQ ID NO:1 from
about nucleotide position 54 to about nucleotide position 1136; the sense sequence of SEQ ID
NO:1 from about nucleotide position 1147 to about nucleotide position 2412; sense sequence
of SEQ ID NO:1 from about nucleotide position 2409 to about nucleotide position 3410; the
sense sequence of SEQ ID NO:2 from about nucleotide position 80 to about nucleotide
position 1048; the sense sequence of SEQ ID NO:2 from about nucleotide position 1048 to
about nucleotide position 2295; the sense sequence of SEQ ID NO:2 from about nucleotide
position 2348 to about nucleotide position 3061; the sense sequence of SEQ ID NO:2 from
about nucleotide position 3214 to about nucleotide position 4677; the sense sequence of SEQ
ID NO:2 from about nucleotide position 4674 to about nucleotide position 5879; the sense
sequence of SEQ ID NO:2 from about nucleotide position 5917 to about nucleotide position
7386; and the sense sequence of SEQ ID NO:2 from about nucleotide position 7415 to about
nucleotide position 7996.

22. An isolated polypeptide of claim 21 encoded by the sequence of SEQ ID NO:2 from
about nucleotide position 80 to about nucleotide position 1048.
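
The ten coordinate ranges recited in claims 1 and 21 can be tabulated and checked for internal consistency. The Python sketch below is offered only as an illustration: it records each range as stated in the claims and reports its length in nucleotides and whether that length is a whole number of codons. The claims qualify every position with "about", so treating the endpoints as exact is an assumption made solely for the arithmetic. The names `CLAIMED_RANGES`, `span_length`, and `summarize` are hypothetical.

```python
# Coordinates copied from claims 1 and 21; treating the "about" positions as
# exact is an assumption made only so that the lengths can be computed.
CLAIMED_RANGES = {
    "SEQ ID NO:1": [(54, 1136), (1147, 2412), (2409, 3410)],
    "SEQ ID NO:2": [(80, 1048), (1048, 2295), (2348, 3061), (3214, 4677),
                    (4674, 5879), (5917, 7386), (7415, 7996)],
}


def span_length(start: int, end: int) -> int:
    """Length of an inclusive, 1-based nucleotide range."""
    return end - start + 1


def summarize(ranges):
    """Return (sequence id, start, end, length, whole-codon flag) per range."""
    rows = []
    for seq_id, spans in ranges.items():
        for start, end in spans:
            n = span_length(start, end)
            rows.append((seq_id, start, end, n, n % 3 == 0))
    return rows


for seq_id, start, end, n, whole in summarize(CLAIMED_RANGES):
    print(f"{seq_id} {start:>5}-{end:<5} {n:>5} nt  whole codons: {whole}")
```

Under this reading every range spans a whole number of codons (for example, positions 54 to 1136 cover 1083 nucleotides, or 361 codons), as would be expected of open reading frames. The check concerns the coordinates only and says nothing about the sequences themselves.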
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1 CACGCCGACGCGATCGCGCCGCACATCGACGCCTGGCTGGGCCGAGGCAATTCATGACCA 60 



61 CGACCGATCGCGCCGGGCTGGGCAGGCAGCTCCAGATGATCCGCGGCCTGCACTGGGGTT 120
TDRAGLGRQLQMIRGL HWGY

121 ACGGCAGCAACGGCGACCCTTACCCGATGCTGCTGTGCGGACACGACGACGACCCGCAGC 180 
GSNGDPYPMLLCGHDDDPQR 



181 GCCGGTACCGCrCGATGCGCGAGTCCGGTGTGCGGCGCAGGACCGAGACGTGGGTGGTGC 240 
RYRSMRESGVRR RTETWV VA 

241 CCGACCACGCCACCGCCCGGCAGGTGCTCGAC6ACCCCGCGTTCACCCGCGCCACCG6AC 300 
DHATARQVLDDPAFTRAT G R 

301 GCACACCGGAATGGATGCGGGCCGCGGGCGCGCCACCCGCCGAGTGG^ 360 
T P E W M R A AGAPPAEHAQPFR 



361 GGGACGTGCACGCCGCGTCCT6GGAAGGCGAGGTCCCCGACGTCGGGGAACT6GCGGA 420 
DVHAASWEGEVPDV G E L A E S 

421 GCTTCGCCGGTCTGCTCCCCGGCGCGGGCGOSCGGCTGGACCTGGTCGGCGACTTCGCCT 480
FAGLLPGAGARLD LVGDFAW 

481 GGCAGGTACCGGTGCAGGGCAT6ACaX:C6T6CTCGGCGCAGCCGGAGTGCrG^ 540 
QVPVQGMTAVLGAAGVLRGA 

541 CCGCGTGGGACGCCCGCGTCAGCCTGGACGCCCAGCTCAGCCCGCAGCAGCTCGCGGTGA 600
AWDARVSL DAQi.SPQQI.AVT 



601 CCGAAGCAGCGGTCGCGGCACTGCCCGCCGACCCCGCACTGCGCGCCCTGTTCGCCGGGG 660 
E A A VAALPADPALR A L F A G A 

661 CCGAGATGACCGCGAACACCGTGGTCGACGCGGTCCTGGCCCT^ 720

E M T A N T V V D A V L A V S A B P 6 L 

721 TGGCCGAACGGATCGCCGACGACCCCGCCGCCGCGCAGCGAACCGTCGCCGAGGTGCTGC 780 
AERI ADDPAAAQRTVAEVLR 


781 GCCTGCACCCGGCATTGCACCTG6A6CGGCGCACGGCCACCGCAGAGGT6CGGCTCGGCG 840 
LBPALHLERRTATABVRLGE 

841 AGCACGTGATC6GC6AA6GCGA06AGGTCGTGGTCGTCGTCXKX3GCGGCCAACCG 900 
HVI 6EGEEVVVVVAAA-NRDP 

901 CGGAGGTCTTCGCCGAGCCCGACCGCCTCGACGTGGACCGCCCCGACGCCGACCGCGCGC 960
EVFAEPDRLDV ORPDADRAI.

961 TGTCGGCACATCGCGGCCACCCCGCCAGGCTGGAGGAGCTGGTCACCGCGCTCCCCACCG 1020 
SAHRGHPGRLEELVTALATA 

1021 CCGCACTGCGGGCCGCGGCCAAGGCGCTGCCCGGACTCACGCCCAGCGGCCCGGTCGTCC 1080 
ALRAAAKALPGLTPSGPVVR 


1081 GGCGCCGCCGATCACCCGTCCTGCGGCGAACCAACCGCTGCCCCGTCGA6CTCTGAG6AT 1140 
RRRSPVLRGTNRCPVEL* 

1141 TCCGCGATGCGCGTCGTCTTCTCCTCCATGGCCAGCAAGAGCCACCTCTTCGGCCTCGTC 1200 
MRVVFSSMA SKSHLPGLV 

1201 CCCCTCGCATGGGCGTTCCGCGCGGCGGOGCU^CGAGGTCCGCGTGGTCGC^ 1260 
PLAWAFRAAGHEVRVVASPA 

1261 CTCACCGAGGACATCACCGCGGCCGGGCTGACCGCCGTCCCGGTCGGCACCGACGTC^ 1320
LTEDITAAGLTAVPVGTDVD 

1321 CTCGTGGACTTCATGACCCACGCGGGCCACGACATOlTCGACrrACGTCC^ 1380 
LVDrMTHAGHDIIDYVRSIiD 


1381 TTCAGC6AGCGGGACCCCGCCACCTTGACCTGGGAGCACCTGCGGG6CAT6CACACCCTG 1440 
FSERDPATLTWEHLRGMQTV 

1441 CTCACCCCGACCTTCTACGCCCTGATGAGCCXGGACACGCTCATCGAAGGCATGGTCTCG 1500
LTPTFYALMSPDTLIEGMVS- 


1501 TTCTGCCGGAAGTGGCGGCCCGACCTGGTCATCTGGGAGCCGCTCACCTTCGCCGCGCCC 1560 
FCRKWRPDLVIWEPLTFAAP 

1561 ATCGCGGGCGCGCTGACCGGAACGCCGCACGCGCGGCTGCTGTGGGGACCCGACATCACC 1620 
lAGAVTGTPHARLLWGPDIT 

1621 AGCCGGGCGCGGCAGAACTTCCTCGGCXITGCrrGCCCGACCAGCC 1680 
T R A R Q N F t G L L P D Q P E E B R £ 

1681 GGCCCGCTCGCCGAGTGGCTCACCTGGACGCrGGAGAAGTACGGCGGCCCGGCCn 1740 
GPLAEHLTHTLEKYGGPA FD 

1741 GAGGAGGTGGTCGTCGGGCAGTGGACGATC6ACCCCGCCCCGGCCGCGATCAGGCT 1800 
EEVVVGQHTIDPAPAAIRLD 

1801 ACCGGCCTGAAGACCGTCGGGATGCGCTACGTCGACTACAACGGGCCGTCCGTGGTG^^ 1860 
TGLKTVGMRY VDYNGPSVVP 

1861 GAATGGCTGCACGACGAGCCC6AGCGCCGCCGCGTGTGCCTCACGCTCGGGATCTCCAGC 1920 
EHLHDEPERRRVCLTLGZSS 



FIG. 4A-2 
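
The figure pairs each nucleotide line with the deduced amino acid sequence printed beneath it. As an illustration of how that pairing arises, the Python sketch below translates the first thirty nucleotides of the reading frame that begins at about position 54 of the first sequence in this listing. The function name `translate` is hypothetical, the input string was transcribed by hand from positions 54 to 83 of the listing, and the use of the standard genetic code is an assumption, since the text does not state which translation table was applied.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
# Using the standard table here is an assumption, not a statement from the text.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; trailing partial codons are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Positions 54-83 of the first sequence, transcribed by hand from the listing.
orf_start = "ATGACCACGACCGATCGCGCCGGGCTGGGC"
print(translate(orf_start))  # MTTTDRAGLG
```

The residues TDRAGL that follow the first three agree with the start of the amino acid line printed beneath nucleotides 61 to 120.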






WO 97/23630 PCT/US96/20238




1921 CGCGAGAACAGCATCGGGCAGGTCTCCATCGAGGAGCTGCTGGCTGCCGTCGGCGACGTC 1980 
RENS IGQVSIEELLGAVGDV 

1981 GACGCCGAGATCATCGCGACCTTCGACGCGCAGCAGCTAGAAGGC6TCGCGAACATCCCG 2040 
DAEI lATFDAOQLEGVANIP 

2041 CACAACGTCCGCACG6TCGGCTTCGTCCCGATGC ACGCGCTGCTCCC6 ACCTGCGCGGCG 2100 
HNVRTVGFVPMHAIiLPTCAA 

2101 ACGGTGCACCAC6GC6GACCCGG6A6CTG6CACACCGCGGCGATCCACGGCGTGCC6CAG 2160 
TVBHG6PGSHHTAAIHGVPQ 

2161 GTGATCCTCCCCGACGGCT(M6ACACCGGCGT6CGCGCGCAGCGCACGCAGGAATTCG^ 2220 
VILPDGWDTGVRAQRTQEFG 


2221 GCGGGGATCGCGCTGCCCGTGCCCGAGCTGACCCCC6ACCAGCTCCGGGAGTCGGT6AAG 2280 
AGIALPVPELTPDQLRESVK 

2281 CGGGTCCTCGACGACCCGGCCCACCGCGCCGGCGCGGCGCGGATGCGCGACGACATGOfC 2340 
RVLDD PAHRAGAARHRDDML 

2341 GCGGAGCCGTCACCGGCCGAGGTCGTCGGCATCTGCGAGGAACTGGCC6CAGGAAGGA6A 2400 
AEPSPAEVVGIC.BELAAGRR 

2401 GAACCACGATGACCACCGACGCC»CGAC6»CGT6CGGCTCGGGCGTTCC6CGCT 2460 
E P R * 

MTTDAATHVRLGRSALLT 

2461 CXAGCAGGCrCTGGCTCGGCACGGTGAACTTCAGCGGACGCGTCGAGGACGACGACGCGC 2520 
SRLWLG.TVNr SGRVEODDAL 


2521 TGCGCCTGAT6GACCACGCCCGGGACCGCGGCATCAACTGCCTCGACACCGCCGACATGT 2580 
R L M D H A R D R G I N C L D T A D M Y 


2581 ACGGCTGGCGGCTCTAOU^GGGCCACACCGAG^ 2640 
GWRLYK G HT BELVGR-WLAQG 

2641 GCGGCGGACGGCGCGAGGACACC6TGCTGGCGACCAAGGTCGGCGGCGA6ATGAGCGAGC 2700 
GGRREDTVLATKVGGEMSER 

2701 GCGTCAACGACAGCGGGCTGTCGGCGCGGaCATCaTCGCCTCCTGCGAGGGATC^ 2760 
VMDSGLSARHIZASCE6 SLR 

2761 GCAGGCTGGGCGTCGACa^TCGACGTCTACCAGATGCACCACATCGACCGCTCCGCGC 2820
RLGVDHIDVYQMHHIDRSAP 


2821 CGTGGGACGAGGTGTGGCAGGCCATGGACAGCCTCGTCGCCAGCGGCAAGGTCTCCTACG 2880 
WDEVMQAMDSLVA S GKVS YV
2881 TCGGCrCGTCCAACTTCGCGGGCTGGCACATCGCCGCCGCGCAGGAGAACGCCGCCCGCC 2940
GSSNFAGW UXAAAQCHAARR 

2941 GCCACTCCCTG6GCATGGTCTCCCACCAGTGCCTGTACAACCT6GCGGTCCGGCACGCC6 3000 
HSLGHVS BQCLYNLAVRBAE 

3001 AGCTGGAGGTGCrrGCCCGCCGCGCAGGCCTACGGGCTCGGCGTCrrCGCCtGGTCGCC^ 3060
LEV LP AAQAY6 L G V FAHS P L. 

3061 TGCACGGCGGCCTGCTCAGCGGAGCGCTGGAGAAGCTGGCCGCGGGCACCGCG^ 3120 
H6GLLSGA LEKLAA6TAVKS 



3121 CGGCGCAGGGCCGTGCGCAGGTGCTGTTGCCGTCCCTGC6CCCGGCGATCGAGGCCTACG 3180 
AQGRAQVLLPSLRPAIEAYE 

3181 AGAAGTTCTGCCGCJUICCTCGGCGAAGACCCGGCCGAGGTGGGGCTCGCA^ 3240
KFCRNLGEOPAEVGLAHVLS 



3241 CCCGGCCCGGCATCGCCGGCGCCGTCATCGGCCCGCGAACCCCCGAGCAGCICGACTK^ 3300
RPGZAGAVI6PRTPEQLDSA 

3301 CGCTGAAGGCGICCGCGATGACCCTGGACGAGOVGGCGCTGTCCGAACTGGACGACATCT 3360 
LKASAMTLDEQALSELDEIF 

3361 TCCCCGCGGtGGCCTCCGGCGGCGCGGCGCCGGAAGCCTGGTTGCAGTGAGCAC^ 3420 
PAVASG GAAPEANLQ* 



3421 AACCGAGAAAGGATACGGCTGGTGAGCGTGAAGCAGAAGTCAGCGTTGCAGGACCTGG^ 3480 



3481 GACTTCGCCAAGTGGCACGTGTGGACCAGGGXGCGGCCGT^ 3540 



3541 TACGAGCTGTTCGCCGACGACCACGAGGCdACGACCG^ 3600 



3601 TACTGGAAGCCCGGGTGCGCCGGCCTGGAGGAGGCCAACCAGGAGCTGGCGAACCAGm 3660 
3661 GCCGAGGCCGCGGGGATCAGCGAGGGCGACGAGGTGCTCGACGTCGGGTTCGGGCTCGGC 3720 



3721 GCGCAGGACTTCTTCTGGCTCGACCTGCAGCCAGCT 3756

1 CGGGTTGCCGCACATCGCGCTGGGGAGATTCTTTGAATTTCGCCCGTAGCACCGACCTGG 60



61 AAAGCGAGCAAAT6CTCCGGT6AATGGGATCAGTGATTCCCCGCGTCAATTGATCACCCT 120 

VNGISDSPRQLITL 



121 TCTGGGCGCTTCCGGCTTCGTCGGGAGCGCGGTTCTGCGCGAGCTGCGCGACCACCCGGT 180 
LGASGFVGSAVLRELRDHPV 

181 CCGGCTGCGCGCGGTGTCCCGCGSCGGAGCGCCCGCGGTTCCGCCCGGCCCCGCGGAGGT 240 
RLRAVSRGGAPAVPP GAAEV 



241 CGAGGACCTGCGCGCCGACCTiSCTGGAACCGGGCGGGGCCGCCGCCGCGATCGAGGACGC 300 
ED LRAOtLEPGRAAAAIEDA 



301 CGACGTGATCGTGCACCTGGTGGCGCACGaUK:6GGCGGTTCCACCT6GCGCAGCGC»^ 360 
0 V 1 VHI.VAHAAG6 S TWRSA T 

361 CTCCGACXCGGAAGCCGAGCGGGTCAACGTCGGCCIGATGCACGACCTCGTCGGCGCGCT 420 
SDPEAERVHVGLMHD LVGA L 




421 GCACGATCGCCGCAGGTCGACGCCGCCCGTGTTGCTCTACGCGAGCACCGCACAGCCCGC 480 
BDRRRSTPP VL LY ASTAQAA 

481 GAACCC6TCGGa«5CCAGCAGGTACGCCCAGC»GAAG^^ 540 
NPSAASRYAQQKTBAERILR 

541 CAAAGCCACCGACG AGG6CCGGGTGCGCGGCGTG ATCCTGCGGCTGCCCGCG6TCTACGG 600 
KATDEGRVRGVIIiRLPAVYG 



601 CCAGAGCGGCCCGTCCGGCCCCATGGGGCGGGGCGTGGTCGCAGCGATGATCCGGCGTGC 660
QSGPSGPMGRGVVAAMIRRA 

661 CCTCGCOSGCGAGCCGCTCACCATGTGGGAa^CGGCGGCGTGCGCCGCGA 720 
L A G E P L T M W B p 6 6 V R R D L Li^H. 

721 C6TCGA66AC6t66CCACC6C6TTCGGC6CC6C6CTG6AGCACCAC6AC6^^ 780 
VEDVATAFAAALEHRDALA 6 




781 CG6CACGTG6GCGCT66GC6CCGACCGATCCGAGCCGCTCGGCGACATCTTCCGGGCC6T 840 
GTWALGADRSEPLGDIFRAV 



841 CTCCGGCAGCGTCGCCCGGCA6ACCGGCAGCCCCGCCGTC6ACGTGGTCACCGTGCCCGC 900 
S6SVARQT6SPAVDVVTVPA 

901 GCCCGAGCACGCCGAG6CCAACGACTTCCGCAGC6ACGACATCGACTCCACCGAGTTCCG 960 
PEHAEAM DPRSDDIDSTEFR
961 CAGCCGGACCGGCTGGCGCCCCCGGGTTTCCCTCACCGACGCCATCGACCGGACGGTGCC 1020
S RTGWRP RVS LTDG I DRTVA 

1021 CGCCCTGACCCCCACCGAGGAGCACTAGTGCGGGTACTGCTGACGTCCTTC^^ 1080 
ALTPTBEH* 

VRVLLTSFAHR 

1081 ACGCACTTCCAGGGACTGGTCCCGCT6GC6T6GGC6CTGC6aiCC6C6^ 1140 
TBPQGLVPLAHALRTA6HDV 

1141 CGCGT6GCC6CCCAGCCCGCGCTCACC6ACGCG6TCATC6GCGCCGG7CTCACCGC 1200 
RVAAQPALTDAVI6AGLTAV 

1201 CCCGTCGGCTCCGACCACCGGCTGTTCGACATCGTCCCGGAAGTCGCCGCTCA6GTGCAC 1260 
PV6SDH RLFDIVPEVAAQVH 

1261 CGCTACTCCTTCTACCTGGACTTCTACOVCCGCGAGCAGGAGCTGCACTCGTGGGAGTO 1320 
RYSFYLDFYHREQELBSHEF 

1321 CTGCTCGGCATGOIGGAGGCCACCTCGCGGTGGGTATACCCGGTGGTCAACAACGAC^ 1380 
LLGMQEATSRHVYPVVMHDS 

1381 TTCGTCGCCGAGCTGGTCGACTTCGCCCGGGACTGCklGTCCTGACCTGGTGCT^ 1440 
FVAELVDFARDW.RPDLVLHE 

1441 CCGTTCACCnCGCCGGCGCCGTCGCGGCCCGGGCCTGCGGAGCCGCGCACGCCCGGCTG 1500 
PFTFA .GAVAARAC6AAHARL 

1501 CTGTGGGGCAGCGACCTCACCGGCTACTTCCGCGGCCGGT7CCAGGCGCAACGCCTGCGA 1560 
LHGSDLTGYFRGR FQ AQRLR 

1561 CGGCCGCCGGAGGACCGGCCGGACCCGCTGGGCACGTGGCTGACCGA6G7CGCGGGGCGC 1620 
R P P E D R P D P I. 6 T « I. 7 E V. A 6 R 

1621 na^GCGTCGAAnCGGOSAGGACCTCGCGGTC^^ 1680 
FGVEFGEDLAV6Q WSVDQLP 

1681 CCGAGTTTCCGGCTGGACACCGGAATGGAAACCGTTGTCGCGCGGACCCTGCCCTACAAC 1740
P S F R L D T 6 METVVART L P YN 

1741 GGCGCGTCGGTGGTTCCGGACTGGCTOU^AAGGGCAGTGCGACTCGACGCATCTGCATC 1800 
GAS VVPDHL KKGSATRRICI 

1801 ACCGGAGGGTTCTCCGGACTCGGGCTCGCCGCCGATGCCGATCAGTTCGCGCGGACGCTC 1860 
T6GF S G L6 LAADA D QF ART L
1861 GCGCJ^CTCGCGCGATTCGATGGCGAAATCGTGGTTACGGGTTCCGGTCCGGATACCTCC 1920
AQLARFOGEIVVTGSGPDTS 



1921 GCG6TACCGGACAA»TTCGTTTGGTGGATTTCGTTCC6ATGGGCGTTCTGCTCCAGAAC 1980 
AVPDNIRLVD FVPMGVLLQN 



1981 T6CGCGGCGATCATCCACCACGGCGGGGCCG6AACCTGG6CCACGGCACTGCACCACGGA 2040
CAAIXHHGGAGTHATALHRG 



2041 ATTCCGCAAATATCAGTTGCACATGAATGGGATTGCA7GCTACGC6GCCAGCAGACCGCG 2100 
IPQISVAHEffDCMLRGQQTA 



2101 6AACTG66CGCGGGAATCTACCTCCGGCCGGACGAGG7C6A7GCCGACTCA7TGGCGAGC 2160 
ELGAQIYLRPDEVDADS^ XiAS 



2161 GCCXTCACCCAGGTGGTCGAGGACCCCACCTACACCGAGAACGCGGTGAAG^ 2220 
ALTQVVEDPTYTENAV KLRE 



2221 GAGGCGCTGTCCGACCCGACGCC6CA6GAGATCGTCCCGCGACTGGA6GAACTCACGCGC 2280 
EALSDPT PQEIVPRLEELTR 

2281 CGCaiCGCCGGCTAGCGGTnCCGACCGACAAGTCCGTCCGACAGCACACCrCCG 2340 
R H A G * 



2341 AGCAGGGATGTAOSAGGGCGGGTTCGCCGAGCTnACGACCGGTTCTACCGCGG^ 2400 
MYEGGFAELYDRFYRGRG 



2401 CAAGGACTACGCGGCCGAGGCCGCGCAGGTC6C6C6GCTGGTCA6AGACCGCCT6CCCTC 2460
K D Y AAEA AQVARLVRDRLP S 



2461 GGCTTCCTCGCrrGCTCGACGTGGCCTGCGGGACCGGCACCCACCTGCGCCGGTTCGCCGA 2520 
AS S LLDVA CG T GTH LRRFAO 



2521 CCTCTTCGACGACGTGACCGGGCTGGAGGTGfCGGGGGGGAT^ 2580 
« t P D O V T 6 L E L 5 A A M X £ V A R P 

2581 GCAGCTCGGCGGCATCCCGCTGCTGCAGGGCGACATGCGCGACTTCGCGCT^ 2640 
QLG6IPVLQGDMRDFALD.RE 



2641 GTTCGAC6CCGTCACCTGCATGTTCAGCTCCATCGGGCACATGCGCGACGGCGCCGAGCT 2700 
FDAVTCHFSSZGBMRDGAEL 



2701 GGACCAGGCGCTGGa;TCCTTCGCCCGCC:ACCTCGCCCCCGGCGGCGTCGTGGTGGTCG^ 2760 
OQALASFARHLAPGGVVVVE 



2761 ACCGTGGTGGTTCCCGGAGGACTTCCTCGACGGCTACGTGGCCGGTGACGTGGTGCGCGA 2820 
PWWFP EDFLDGYVAGDVVRD
2821 CGGCGACCTGACGATCTCGCGfeGTCTCGCACTCCGTGCGCGCCGGCGGCGCGACCCGGAT 2880
GDLT ISRVSHS VRAGGATRM 


2881 GGAGATCCACTGGGtCGTGGCCGACGCGGTGAACGGTCCGCGGCACCACGTGGAGCACTA 2940 
EIHWVVA .DAVNGPRHHVEHY 

2941 CGAGATCACGCTCTTCGAGCGGCAGCAGTACGAGAAGGCCTTCACCGCGGCCGGTTGCG^ 3000 
EITLFERQQYEKAFTAAGCA 

3001 TGTGCAGTACCTGGAGGGCGGACCCTCCGGACGCGGGTrGTTCGTCGGTGTGCGCGGATG 3060 
VQYLEGGPSGRGLFVGVRG* 


3061 ACCCGTGCGTCGCGTTTrCCGTTCCTGGCACAGGTGATCCGCTCCACGGGCCCTTTCCCC 3120 


3121 GCCGTGACCGGACCCTTACAGTGAGTGCGGGTCTTGATCGACAACGCCCGGCGGCAGCAA 3180 

3181 GCGGAGCCGTCGACGACACCGCAGGGASACTCGATGGGTGATCGGACCGGCGA^ 3240 

MGDRT6DRT 




3241 ATTCC6GAAXCCTCGCAGACCGCAACGCGTTTCCTGCTC6GCGACGGCGGAATCCCCACC 3300 
I PESSQTATRPLLGDGGIPT 

3301 GCCACGGCGGAAACCCACGACTGGCTGACCCGCAACGGCGCCGAGCAGCGGCTCGAGGTG 3360 
ATAETBDHLTR N G h Z Q K h E V 

3361 GCGCGCGTGCCGTTOIGCGCCATGGACCGCTGGTCGTTCXAGCCCGAGGACGGCAGGCTC 3420 
ARVPFSAMDRWSFQPEDGRL 

3421 GCCCACGAGTCCGGGCGCTTCTTCTCCATCGAGGGCCTGCACGTGCGGACGAACTTCGGC 3480 
ABESGRFFSIE6L H.VRTN FG 




3481 TGGCGGCGGGACTGGATCCAGCCCATCATCGTGCAGCCCCAGATCGGCTTCCTCGGCCTC 3540 
W R R D W I Q P I I V Q P E I G F I. G L 


3541 ATCGTCAAGGAGTTCGACGCTGTGCTGOICGTGCTGGCGCA^ 3600 
I VKE FD G V LH V LAQ AKA E P G 

3601 AACATOACGCCGTCOVGCTCTCCCCGACCCTGCAGGOGA^ 3660 
NINAVQLSPTLQATR S H Y T G 

3661 GTCCACCGCGGCTCGAAGGTCrGGTTa^TCGAGTACTTCAACGGCAre 3720 
VHRGSKVRFIEYFHGTRPSR 

3721 ATCCTOSTCWCGTGCTCCAGTCCGAGCAGGGCGCGTGGTTCCTGCGCAA 3780 
I LV DVLQS EQGAWFLRKRH R
3781 AACATGGTCGTCGAGGTGTTCGACCACCTGCCCGAGCACCCGAACTTCCGGTGGCTGACC 3840
NMVVEVFDDLPEHPNFRWLT 



3841 GTCGC6CAGCTGCGG6CGATGCTGCACCACGACAACGTGGTGAACATGGACCTGCGCACC 3900 
VAQLRAMLH HDNVVNMDL RT 




3901 GTGCTGGCCTGCGTCCCGACCGCCGTGGAGCGGGACCGGGCCGACGACGTGCTCGCGCGC 3960 
VLACVP TAVBRDRADDVLAR 



3961 CTGCCCGAGGGCTCGTTCCAGGCCCGGCTGCT6CACTCGTTCATCGGCGCGGGCACCCCG 4020
LPEGSFQARLLHSFIGAGTP 



4021 GCauiCAACATGAACAGCCTGCTGAGCT6GATCTCCGACCTGCGCGCC»GTC 4080 
ANNMHS LX.SHXSDVRARREF 



4081 GTGCAGCGCGGCCGCCCGCTGCCCGACATCGAGCGCAGCGGGTGGATCCGCCGCGAC^ 4140 
VQR GRP LPDIERSGWXRRDD 

4141 GGaiTCGAG»CGA66A6AAGAAGTACTTC6ACGTCtTCGGCGTCACGGT 4200 
6IEHEE KKYPDVFGVTVATS 



4201 GACCGCGAGGTCAACTCGTGGATGCAGCCGCTGCTCTCGCCCGCCAACAAO^GC 4260 
DREVMS HMQPLLS PA NH6LL 

4261 GCCCTGCTGGTCAAGGAOITCGGCGGCACGTTGCACGCGCTCGTGCAGCTGCGCACCGAG 4320 
ALLV KD I6GTLHALVQLRTE 



4321 GCGGGCGGGAT6GACGTCGCCGAGCTGGCGCCTACGGTGCACTGCCAGCCCGACAACTAC 4380 
AG GMDV AE LAP TVHCQP DNY 




4381 GCCGACGCGCCCGAGGAGTTCCGACCGGCCTATGTGGACTAC6TGTTGAACGTGCCGCGC 4440 
ADAPEEPRPAYVDYVLNVPR 



4441 TCGCAGGTCXGCTACGACGCATGGCACrCCGAGGAGGGCGGCCGGTTCTAC^ 4500 
S Q V R Y D A H B S E E G G R F Y R N E 

4501 AACCGGTACATGCTGATCGAGGTGCCCGCCGACTTCGACGCMGTGCCGCTCCCGAC^^ 4560 
NRYMLXEVPADFDASAAPDR 



4561 CGGTGGATGACCn^CGACCAGATCACCTACCTGCTCGGGCACAGCCAC^^ 4620 
RWMTFDQITYI.L6HSHYVNI 



4621 CACGTGCGCAGCATCATCGCGTGO^CCTCGGCCGTCrACACCAGGACCGCCGGATGAAAC 4680 
HVRS I lACASAVYTRT'AG* 

M K R 




4681 GCGCGCTGACCGACCTGGCGATCTTCGGCGGCCCCGAGGCATTCCTGCACACCCTCTACG 4740 
ALTDLAIFGGPEAFLHtLYV
4741 TGGGCACGCCGACCGTCGCSGGACCGGWGCGOTCTXCGCCCGCCTGGAGTGG^ 4800
GRPTV60RERFFARLEHALN 



4801 ACAACAACTGGCTGACCAACGGCGGACCACTG6TGCGCGAGTTCGAGGGCC6GGTCGCCG 4860
MHWLTNGGPLVREFEGRVAD 



4861 ACCTGGCGGGTGTCCGCCACTGCGTGGCCACCTGCAACGCGACGGTCGCGCTGCAACTGG 4920
LAGVRHCVATCHATVALQLV 



4921 TGCTGCGCGCGAGCGACGTGTCCGGCGAGGTCGTCATGCCTTCGATGACGTTCGC^ 4980 
L. RASDVSGEVVMPSMTFAAT 



4981 CCGCGCACGC6GC6AGCT66CTGGG6CTG6AACC66TGTTCT6CGAC« 5040 
AHAASHL6LEPVFCDVDPST 



5041 CCGGCCTGCTCGACCCCGAGCACGTCGCGTCGCTGGTCACACCGC^^ 5100 
6LLDPEBVASLVTPRTGAIX 



5101 TCGGCGTGCACCTCTGGGGOUSGCCCGCTCCGGTCGAGGCGCTGGAGAAGATCGCC^ 5160 
GVBLNGRPAPVEALE KZAAE 



5161 AGCAOU^GTCAAACTCTTCTTCGACGCCGCGCAaSCGCTGGGCTG^^ 5220 
RQ VRZ.FFDAAHALGCT AG6R 

5221 GGCCGGTCGGCGCCnCGGCAACGCCGAGCTGTTCAGCTTCCACGCCACG^^ 5280 
PVGAFGNAEVFSFHATKAVT 



5281 CCTC6TTCGAGGGCGGCGCCATCGTCACCGACGACGGGCTGCTGGCCGACCGCATCCGCG 5340 
SF EGGAIVTDDGLLADRIRA 



5341 CCATGCACAACTTCGGGATCGCACCGGACAAGCTGGTGACCGATGTCGGCACCAACGGCA 5400 
MHNFGIAPDKLVTDVGTHGK 



5401 AGATGAGCGAGTGCGCCGCGGCGATGGGCCTCACCTCGCTCGACGCCTTCGCCGA^ 5460 
, M S E C A A A M G L T 5 L D A F A E T R 

5461 G6GIGCACAACCGCCTCAACCAC6a;CTCTACTCCGAC6AGCTCCGCGACGTGCGCG^ 5520 
VHHRLNHALYS DELRDVRGI 



5521 TATCCGT6CACGC6TTCGATCCTG6C6AGCAGAAC2yu:TACCAGT^ 5580 
SVHAFDPGEQNNYOyVZI SV 



5581 TGGACTCCGCGGCCACCGGCATCGACCGCGACCAGTTGCAGGCGATCCTG^^ 5640 
D S AATGZDRDQLQAZLRAER 



5641 AGGTTGTGGCACAACCCTACTTCTCCCCCGGGTGCCACCAGATGCAGCCCTACCGG^^ 5700 
VVAOPYFSPGCRQMQPY R T E
5701 AGCCGCCGCTGCGGCtGGAGAACACCGAACAGCTCTCCGACCGGGTGCTCGCGCTGCCCA 5760
PPLRLCNTEQLSDRVLALPT 



5761 CCGGCCCCGCGGTGTCCAGCGAGGACATCCGGCGGGTGTGCGACATCATCCGGCTCGCCG 5820 
GPAVSSEDIRRVCDIIRLAA 



5821 CCACCAGCGGCGAGCTGATCAACGCGCAATGGGACCAGAGGACGCGCAACGGTTCGT6AC 5880 
TSGELINAQWDQRTRNGS* 



5881 GACCTGCGCCACAAGTGCCAGGAGGTTCGCTCCCCGATGiuiCACAACTCCT 5940 

MMTTRTAT 



5941 GCCCAGGAAGCGGGGGTCGCCGACGCGGCGCGCCCGGACOTCGACCGGCGGGC^ 6000
AQEAGVADAARPDVDRRAVV 



6001 CGGGC6CTGAGCTC66AGGTCTCCC6C6TCACCG6CGCC66TGACGGTGACGCCCAC6T 6060 
RALSSEVSRVTG A6D GDAHV 



6061 CAGGCCGCCC6GCTCGCCGACCTCGCCGCGCACTAC6GGGCGCACCCGTTCACGCCGCTG 6120 
QAARIfADLAAHYGAHPFTPL 



6121 GAGCAGACGCGTGCGCGGCTCGGCCTGGACCGCGCGGAGTTCGCCCACCTGCTCGACCTG 6180 
EOTRAR IiGLDRAEFAHLLDL 



6181 TTCGGCCGCATCCCGGACCTGGGCJICCGCGGTGGAGCACGGTCCGGCGGGCAAGTACT^ 6240 
FGRZPDLGTAVEHGPAGKYH 



6241 TCCAACACGATCAAGCCGCTGGACGCCGCAGGCGCACTGGACGCGGCGGTCTACCGCAAG 6300 
SNTIKPLDAAGALDAAVYRK 



6301 CCTGCCTTCCCCTACAGCGTCGGCCTGTACCCCGGGCCGACGTGCATGTTCCGCTGCCAC 6360 
PAFPYSVGLYPGPTCMFRCH 



6361 TTCTGCGTGGGGGTGACCGGTGCCCGCTACGAGGCCGCATCGGTCCCGGCGGGCAAC^ 6420 
F C V R V T G A R Y E A A S V P A G N E 



6421 ACGCTGGCCGCGATCATCGACGAGGTGCCCACGGACAACCCGAAGGCG 6480 
TLAAIIDEVPTDNPKAMYNS 



6481 GGCGGGCTCGAGCCGCTGACCAACCCCGGTCTCGGCGAGCTGGTGTCGCACGCCGCC^ 6540 
GGLEPLTNPGLGELVSHAAG 



6541 CGCGGTTTCGACCTCACCGTCTACACCAACGCCTTCGCCCTCACCGAGCAGACGCT^ 6600 
RGFDLTVYTNAFALTEQTLN 



6601 CGCCAGCCCGGCCTGTGGGAGCTGGGCGCGATCCGCACGTCCCTCTACGGGCTGAAC^ 6660 
RQPGLWELGAIRTSLYGLNN 

6661 GACGAGTACGAGACGACCACCGGCAAGCGCGGCGCTTTCGAACGCGTCAAGAAGAACCTG 6720 
DEYETTTGKRGAFERVKKNL 

6721 CAGGGCTTCCTGCGGATGCGCGCCGAGCGGGACGCGCCGATCCGGCTCGGCTTCAAC^^ 6780 
QGFLRMRAERDAPIRLGFNH 

6781 ATCATCCTGCCGGGACGGGCCGACGGGCTCACCGACCTCGTCGACTTCATCGCCGAGn 6840 
IILPGRADRLTDLVDFIAEL 

6841 AACGAGTCCAGCCCGCAACGGCCGCTGGACTTCGTGACGGTGCGCGAGGA 6900 
NESSPQRPLDFVTVREDYSG 

6901 CGCGACGACGGCCGGCTGTCGGACTCCGAGCGCAACGAGCTGCGCGAGGGCCTGGTGCGG 6960 
RDDGRLSDSERNELREGLVR 

6961 TTCGTCGACTACGCCGCCGAGCGGACCCCGGGCATGCACATCGACCTGGGCTACGCC^ 7020 
FVDYAAERTPGMHIDLGYAL 

7021 GAGAGCCTGCGGCGGGGTGTGGACGCCGAGCTGCTGCGCATCCGGCCGGAGACGATGCGT 7080 
ESLRRGVDAELLRIRPETMR 

7081 CCCACCGCGCACCCCCAGGTCGCGGTGCAGATCGACCTGCTCGGCGACGTCT^ 7140 
PTAHPQVAVQIDLLGDVYLY 

7141 CGCGAGGCGGGCTTCCCGGAGCTGGAGGGCGCCACCCGCTACATCGCGGGCCGGGTCACC 7200 
REAGFPELEGATRYIAGRVT 

7201 CCGTCGACCAGCCTGCGCGAGGTGGTGGAGAACTTCGTGCTGGAGAACGAGGGCGTGCAG 7260 
PSTSLREVVENFVLENEGVQ 

7261 CCCCGCCCCGGCGACGAGTACTTCCTCGACGGCTTCGACCAGTCGGTGACCGCACG^ 7320 
PRPGDEYFLDGFDQSVTARL 

7321 AACCAGCTCGAACGAGACATCGCCGACGGGTGGGAGGACCACCG^ 7380 
N Q L E R D I A D G W E D R R G F t R G 

7381 AGGTGAACCGGAGTTGCGAGTACGTGAGCrrGGCGGTGGCGGGCGGTTTCGAGTTCACCCC 7440 
R* VAGGFEFTP 

7441 CGACCCGAAGCAGGACCGGCGGGGCCTGTTCGTGTCTCCGCTGCAGGACGAGGCGTTCGT 7500 
DPKQDRRGLFVSPLQDEAFV 

7501 GGGCGCGGTGGGCCATCGGTTCCCCGTCGCCCAGATGAACCACATCGTCTCCG^ 7560 
GAVGHRFPVAQMNHIVSARG 

7561 CGTGCTGCGCGC^CTGCACTTCACCACCACCCCGCCGGGG^^ 7620 
VLRGLHFTTTPPGQCKYVYC 

7621 CGCGCGCGGCCGGGCGCTCGACGTCATCGTCGACATCCGGGTCGGCTCGCCGACGTTCGG 7680 
ARGRALDVIVDIRVGSPTFG 

7681 GAAGTGGGACGCGGTGGAGATGGACACCGAGCACTTCCGGGCGGTCTACTTCCCCAGGGG 7740 
KWDAVEMDTEHFRAVYFPRG 

7741 CACCGCGCACGCCTTCCTCGCGCTTGAGGACGACACCCTGATGTCGTACCTGGT^^ 7800 
TAHAFLALEDDTLMSYLVST. 

7801 GCCGTACGTGGCCGAGTACGAGCAGGCGATCGACCCGTTCGACCCCGCGCTGGGTCTGCC 7860 
PYVAEYEQAIDPFDPALGLP 

7861 GTGGCCCGCGGACCTGGAGGTCGTGCTCTCCGACCGCGACACGGTGGCCGTGGAC^^ 7920 
WPADLEVVLSDRDTVAVDLE 

7921 GACCGCCAGGCGGCGAGGGATGCTGCCCGACTACGCCGACTGCCTCGGCGAGGAGC^^ 7980 
TARRRGMLPDYADCLGEEPA 

7981 CAGCACCGGCAGGTGACGGCTCCCGAGCACGATCTGTTCGAAGTGGCGCAGGCG 8040 
S T G R * 

8041 GTCGCGGTCGA 8051 